LLaMa2 - Running an LLM locally


DevHam94 2024. 12. 31. 18:27

https://www.llama.com/llama2/

Meta Llama 2 (www.llama.com): Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama Chat, leverages publicly available instruction datasets and over 1 million human annotations.

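Using ChatGPT through its API normally costs money.

Running an official open LLM such as Llama 2 locally incurs no usage charges, but it does require a reasonably capable machine.

Fortunately, smart people have made lightweight versions.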

https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML

TheBloke/Llama-2-7B-Chat-GGML (huggingface.co): This repo contains GGML format model files for Meta Llama 2's Llama 2 7B Chat. Important note regarding GGML files: the GGML format has now been superseded by GGUF. As of August 21st 2023, llama.cpp no longer supports GGM…

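GGML is a machine learning library and model format created by Georgi Gerganov (the "GG" in the name), and the files in this repo are lightweight, quantized versions of the model. It was built so that the model can run on a CPU even without a GPU.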

The repo offers several variants with different hardware requirements, so download the one that fits your machine.
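As a rough way to compare the variants, the size of the stored weights can be estimated from the parameter count and the number of bits kept per weight. The sketch below is only a back-of-the-envelope calculation: the function name, the round 7-billion parameter figure, and the bit widths are illustrative assumptions, and the real files are somewhat larger because of metadata and layers kept at higher precision.

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size in gigabytes (10**9 bytes) of the stored weights."""
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    # bits -> bytes -> gigabytes
    return n_params * bits_per_weight / 8 / 1e9


# Assumed round figure for a "7B" model; the exact count differs slightly.
N_PARAMS_7B = 7e9

for bits in (16, 8, 4, 2):
    size = approx_model_size_gb(N_PARAMS_7B, bits)
    print(f"{bits:>2} bits per weight: about {size:.2f} GB")
```

The takeaway is the ratio rather than the exact numbers: halving the bits per weight roughly halves the memory the model needs, at some cost in answer quality.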

Because this version is implemented in C++, one more module is needed to use it from Python.

https://github.com/marella/ctransformers

Install command:

pip install ctransformers

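# LangChain's wrapper around the ctransformers library
from langchain.llms import CTransformers

# Load the quantized model file downloaded from the Hugging Face repo
llm = CTransformers(
  model="llama-2-7b-chat.ggmlv3.q2_K.bin",  # path to the local GGML file
  model_type="llama"
)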
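The snippet above refers to the model by file name only, so the downloaded file has to sit in the directory the script runs from. A small check before loading can make a missing file easier to diagnose. This is a minimal sketch; `find_model_file` and its `search_dir` parameter are hypothetical helpers, not part of ctransformers or LangChain.

```python
from pathlib import Path


def find_model_file(filename: str, search_dir: str = ".") -> Path:
    """Return the resolved path of the model file, or raise if it is missing."""
    path = Path(search_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{filename} not found in {Path(search_dir).resolve()}; "
            "download it from the Hugging Face repo first"
        )
    return path.resolve()
```

With this in place, `str(find_model_file("llama-2-7b-chat.ggmlv3.q2_K.bin"))` could be passed as the `model` argument instead of the bare file name.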

Now call llm.predict, store the return value in result, and print it.
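That last step might look like the sketch below. The `ask` helper and the `EchoLLM` stand-in are illustrative names, not part of LangChain; the stand-in only exists so the example runs without loading the multi-gigabyte model file.

```python
def ask(llm, question: str) -> str:
    """Send one prompt to the model and return the reply without padding.

    `llm` is any object with a `predict(text) -> str` method, such as the
    CTransformers instance created earlier.
    """
    result = llm.predict(question)
    return result.strip()


class EchoLLM:
    """Stand-in used only to exercise `ask` without loading a real model."""

    def predict(self, text: str) -> str:
        return f"  (echo) {text}\n"


# The prompt is written in English on purpose.
print(ask(EchoLLM(), "What is the capital of France?"))
```

With the real model, pass the `llm` object in place of `EchoLLM()`. Recent LangChain releases mark `predict` as deprecated in favor of `invoke`, so depending on the installed version `llm.invoke(question)` may be the call to use instead.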

One drawback of LLaMa is that its Korean support is weak, so it is better to write requests in English.

