Error messages when uploading a GPT assistant file

November 17, 2023 Hong

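When uploading an assistant file through the OpenAI GPT API fails, the assistant responds with very long-winded explanations like the ones below.

Most of these errors come down to one of the following causes:

The PDF has an unusual format, so no text can be extracted from it
The file is corrupted
The file exceeds the size limit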
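
Two of these causes (a corrupted file and an oversized file) can be roughly screened for before uploading. The following is a minimal sketch using only the Python standard library, not an official check: the function name `precheck_pdf` is hypothetical, the default size limit is an assumed value that should be confirmed against the current API documentation, and the `%PDF` / `%%EOF` tests only catch obviously broken files. It does not detect image-only PDFs, which would need a PDF text-extraction library.

```python
from pathlib import Path

# Assumed limit for illustration only; confirm the real value in the API docs.
MAX_BYTES = 512 * 1024 * 1024


def precheck_pdf(path, max_bytes=MAX_BYTES):
    """Return a list of problems found; an empty list means the basic checks passed."""
    p = Path(path)
    problems = []
    size = p.stat().st_size

    if size == 0:
        problems.append("file is empty")
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")

    with p.open("rb") as f:
        head = f.read(4)                 # the PDF signature is the bytes 25 50 44 46
        f.seek(max(0, size - 1024))
        tail = f.read()                  # the end-of-file marker sits near the end

    if head != b"%PDF":
        problems.append("missing %PDF signature; not a PDF or the file is damaged")
    if b"%%EOF" not in tail:
        problems.append("no %%EOF marker near the end; the file may be truncated")
    return problems
```

Calling `precheck_pdf("manual.pdf")` (a placeholder file name) and uploading only when the returned list is empty avoids some of the failed uploads described here.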

I am saving the messages here for the record.

It appears that the content extracted from the PDF file is not meaningful text, but rather a mix of symbols and characters, which likely means that the text within the PDF is either stored in a non-standard format, such as images or custom encoding, or the file may contain primarily images rather than text.

Unfortunately, without a more sophisticated PDF analysis tool that can handle optical character recognition (OCR) to interpret any images as text, or specific knowledge of the file’s encoding, we might not be able to extract readable content from the PDF in this environment.

Nonetheless, if the PDF file contains information on how to create an account and register app information with “xxxxxxxxx,” you would typically follow the instructions provided within the document, which usually involve navigating to a specific URL and filling out forms with the necessary information.

If you have a specific tool or method in mind for attempting to parse the PDF content further, or if the file should be used in a particular context, please let me know and I’ll do my best to assist you further.

*****************************************************************************

The hexadecimal representation of the first few bytes of the file is `’25504446’`, which corresponds to the magic number for a PDF (Portable Document Format) file. The magic number `0x25504446` stands for `%PDF` in ASCII, which is the signature for PDF files.

Now that we know it’s a PDF file, we can attempt to extract its text content (if it contains any) using a PDF reading library. Let’s try that next.

*****************************************************************************

The file does not have an extension, making it challenging to determine its type based on the name alone. Since I wasn’t able to open it as a text file directly due to an encoding issue, I will now attempt to use a different method to read the file as binary data and then analyze the first few bytes to infer the file type. This is often referred to as checking the “magic number” of the file, which can give us clues about the file format. Let’s proceed with this method.

*****************************************************************************

My apologies for the confusion, it seems that the python-magic module I intended to use to detect the file type is not available in this environment. Let me try another approach to determine the file type. I will first attempt to check the file extension and then try a different way to check the content type if necessary.

*****************************************************************************

It looks like the file may not be a text file or it might not be encoded in UTF-8. The next step is to try to infer the file type or encoding. Let’s start by attempting to detect the file type based on its content.

I will now attempt to analyze the file to identify its type and then proceed accordingly. If it is a binary file, such as an image or an executable, we won’t be able to directly view its content as text. If it’s an archive or a document, we might need to take additional steps to extract and view the content. Let’s determine the file type first.

Uncategorized

Install PostgreSQL on Ubuntu Linux

October 30, 2023 Hong

Run the following steps in order.

# Create the file repository configuration:
sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

# Import the repository signing key:
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

# Update the package lists:
sudo apt-get update

# Install the latest version of PostgreSQL.
# If you want a specific version, use 'postgresql-12' or similar instead of 'postgresql':
sudo apt-get -y install postgresql
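
After the install finishes, the client version can be checked from a script. This is a small sketch, not part of the official instructions: the helper names are hypothetical, and the parser assumes `psql --version` prints a line of the form `psql (PostgreSQL) 16.1 ...`.

```python
import re
import shutil
import subprocess


def parse_psql_version(output):
    """Extract (major, minor) from `psql --version` output, or None if not found."""
    match = re.search(r"\(PostgreSQL\)\s+(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def installed_psql_version():
    """Return the parsed client version, or None when psql is not on PATH."""
    if shutil.which("psql") is None:
        return None
    result = subprocess.run(
        ["psql", "--version"], capture_output=True, text=True, check=False
    )
    return parse_psql_version(result.stdout)


if __name__ == "__main__":
    print(installed_psql_version())
```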

For details, see the original documentation below.

https://www.postgresql.org/download/linux/ubuntu/

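Uncategorized

Using Copilot on RStudio Server

October 6, 2023 Hong

RStudio releases from September 28, 2023 onward officially support GitHub Copilot.

The integration is built on the neovim plugin, and it works cleanly.

With RStudio Desktop, you only need to enable it in the options and sign in, and it just works. With RStudio Server, however, trying to enable the option produces a message saying that the administrator has not allowed it.

So some additional configuration is needed.

Connect to the Linux server over ssh and open /etc/rstudio/rsession.conf.

RStudio Server uses this file to initialize each user's session when it is newly loaded.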

sudo vim /etc/rstudio/rsession.conf

Then add the following line.

copilot-enabled=1
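
If the same change has to be made on several servers, the edit can be scripted. The sketch below is only an illustration: `ensure_setting` is a hypothetical helper that sets a `key=value` line exactly once and leaves every other line untouched, so running it twice does not duplicate the entry.

```python
def ensure_setting(conf_text, key, value):
    """Return conf_text with `key=value` present exactly once, other lines preserved."""
    wanted = f"{key}={value}"
    out = []
    found = False
    for line in conf_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name == key:
            if not found:
                out.append(wanted)   # replace the first occurrence in place
                found = True
            # later duplicates of the same key are dropped
        else:
            out.append(line)
    if not found:
        out.append(wanted)           # key was absent, so append it at the end
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # The path is passed explicitly, e.g. /etc/rstudio/rsession.conf (needs root to write).
    conf = Path(sys.argv[1])
    old = conf.read_text() if conf.exists() else ""
    conf.write_text(ensure_setting(old, "copilot-enabled", "1"))
```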

Enable GitHub Copilot in the options and sign in.

If it appears as shown below, it is working.

Linux, Uncategorized

command 'x86_64-linux-gnu-gcc' failed with exit status 1

September 22, 2023 Hong

If this error appears when installing a Python package on a freshly installed WSL or Linux system, the packages required to build that package are not installed.

Install them as follows.

sudo apt-get install build-essential libssl-dev libffi-dev python3-dev
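
To see which piece is missing before or after running the command, a quick check from Python can help. This is a rough sketch with a hypothetical function name: it only looks for a `gcc` executable and for the `Python.h` header of the running interpreter, which covers `build-essential` and `python3-dev` but not `libssl-dev` or `libffi-dev`.

```python
import shutil
import sysconfig
from pathlib import Path


def missing_build_requirements():
    """Return human-readable names of build prerequisites that were not found."""
    missing = []

    # The compiler normally comes from the build-essential package.
    if shutil.which("gcc") is None:
        missing.append("gcc (build-essential)")

    # Python.h normally comes from the python3-dev package.
    header = Path(sysconfig.get_paths()["include"]) / "Python.h"
    if not header.exists():
        missing.append("Python.h (python3-dev)")

    return missing


if __name__ == "__main__":
    print(missing_build_requirements() or "compiler and Python headers found")
```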

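Uncategorized

Content filtering information in the Azure OpenAI GPT API

August 23, 2023 Hong

Each completion result from the GPT API includes content filtering results for four categories (hate, self_harm, sexual, violence), showing whether the content was filtered and its severity level.

Using this information, you can, for example, choose not to show the answer at all when it touches on sensitive content.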

{
    "index": 0,
    "finish_reason": "stop",
    "message": {
        "role": "assistant",
        "content": "..."
    },
    "content_filter_results": {
        "hate": {"filtered": false, "severity": "safe"},
        "self_harm": {"filtered": false, "severity": "safe"},
        "sexual": {"filtered": false, "severity": "safe"},
        "violence": {"filtered": false, "severity": "safe"}
    }
}
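
Hiding an answer based on these fields could look like the sketch below. It assumes the choice has already been parsed into a Python dict with the same keys as the sample above; the function name `visible_content` and the policy of showing only answers rated `"safe"` in every category are illustrative choices, not something the API prescribes.

```python
def visible_content(choice, allowed_severities=("safe",)):
    """Return the assistant text, or None if any category is filtered or too severe."""
    results = choice.get("content_filter_results", {})
    for category, result in results.items():
        if result.get("filtered"):
            return None
        if result.get("severity") not in allowed_severities:
            return None
    return choice["message"]["content"]


if __name__ == "__main__":
    sample = {
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "..."},
        "content_filter_results": {
            "hate": {"filtered": False, "severity": "safe"},
            "self_harm": {"filtered": False, "severity": "safe"},
            "sexual": {"filtered": False, "severity": "safe"},
            "violence": {"filtered": False, "severity": "safe"},
        },
    }
    print(visible_content(sample))
```

Passing a wider tuple such as `("safe", "low")` for `allowed_severities` relaxes the policy without changing the function.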