5 files changed (+17 / -6 lines)

.gitattributes (new file)
+ # this file helps control the behavior of "git archive", which is also used by GitHub releases
+ ut_data/ export-ignore
+
+ version.txt export-subst
+
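As a hedged aside (not part of the change itself): these attributes only take effect when an archive is built, e.g. by `git archive` or by GitHub's release tarballs. The sketch below, assumed to be run from a clone of this repository with `git` on PATH, shows the expected effect: `ut_data/` is dropped from the archive and the placeholder in `version.txt` is expanded.

```python
import subprocess
import tarfile

# Build an archive the way "git archive" (and a GitHub release tarball) would.
subprocess.run(["git", "archive", "--output", "release.tar", "HEAD"], check=True)

with tarfile.open("release.tar") as tar:
    names = tar.getnames()
    # export-ignore: ut_data/ should not be present in the archive.
    print("ut_data included?", any(n.startswith("ut_data") for n in names))
    # export-subst: version.txt should now contain the short hash and ref names
    # instead of the literal "$Format:%h %d$" placeholder.
    print(tar.extractfile("version.txt").read().decode())
```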

.gitignore
  /models
  *.swp
  /.idea
+ .DS_Store
  /config.yml
  # Byte-compiled / optimized / DLL files
  __pycache__/

README.md
@@ -31,7 +31,9 @@ models:
      params:
        path: /absolute/path/to/your/7B/ggml-model-q4_0.bin
  EOF
- python -m python -m llama_api_server
+
+ # start web server
+ python -m llama_api_server
  ```

  ### Call with openai-python
@@ -49,13 +51,13 @@ openai api completions.create -e text-davinci-003 -p "hello?"

  #### Supported APIs
  - [X] Completions
-   - [X] set `temperature`, `top\_p`, and `top\_k`
-   - [X] set `max\_tokens`
+   - [X] set `temperature`, `top_p`, and `top_k`
+   - [X] set `max_tokens`
    - [ ] set `stop`
    - [ ] set `stream`
    - [ ] set `n`
-   - [ ] set `presence\_penalty` and `frequency\_penalty`
-   - [ ] set `logit\_bias`
+   - [ ] set `presence_penalty` and `frequency_penalty`
+   - [ ] set `logit_bias`
  - [X] Embeddings
    - [X] batch process
  - [ ] Chat
@@ -64,8 +66,8 @@ openai api completions.create -e text-davinci-003 -p "hello?"
  - [X] [llama.cpp](https://github.com/ggerganov/llama.cpp) via [llamacpp-python](https://github.com/thomasantony/llamacpp-python)

  #### Others
+ - [X] Performance parameters like `n_batch` and `n_thread`
  - [ ] Documents
  - [ ] Token auth
  - [ ] Integration tests
- - [ ] Performance parameters like `n_batch` and `n_thread`
  - [ ] A tool to download/prepare pretrained models
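The "Call with openai-python" section referenced by these hunks only shows the CLI form (`openai api completions.create -e text-davinci-003 -p "hello?"`). As a hedged companion sketch using the legacy openai-python (pre-1.0) client: the base URL and API key below are placeholder assumptions, not values taken from this project; only the engine name, prompt, and the supported parameters (`temperature`, `top_p`, `max_tokens`) come from the README.

```python
import openai

# Placeholders / assumptions: point api_base at wherever
# "python -m llama_api_server" is actually listening.
openai.api_key = "anything"                    # no token auth yet (see TODO list)
openai.api_base = "http://127.0.0.1:8000/v1"   # assumed host and port

resp = openai.Completion.create(
    engine="text-davinci-003",  # engine name from the README's CLI example
    prompt="hello?",
    temperature=0.8,
    top_p=0.95,
    max_tokens=16,
)
print(resp["choices"][0]["text"])
```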

pyproject.toml
@@ -28,6 +28,7 @@ dependencies = [
      "llamacpp>=0.1.11",
      "Flask>=2.0.0",
      "numpy",
+     "pyyaml",
  ]

  [project.urls]
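The new `pyyaml` dependency presumably backs the `config.yml` handling shown in the README. A minimal, hedged sketch of that kind of load; the nesting (`models` → model name → `params` → `path`) is guessed from the README excerpt above, not copied from the project's actual loader.

```python
import yaml  # provided by the new "pyyaml" dependency

# Parse config.yml and print each configured model path.
# The schema below is an assumption based on the README snippet,
# not the project's real configuration code.
with open("config.yml") as f:
    config = yaml.safe_load(f)

for name, model in (config.get("models") or {}).items():
    params = model.get("params", {}) if isinstance(model, dict) else {}
    print(name, params.get("path"))
```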

version.txt (new file)
+ $Format:%h %d$
+
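One hedged illustration of why the placeholder is useful (a hypothetical helper, not code from this PR): at run time, `version.txt` distinguishes a release archive, where `git archive` has expanded the line, from a plain checkout, where the raw placeholder text is still there.

```python
from pathlib import Path
from typing import Optional

def read_build_version(path: str = "version.txt") -> Optional[str]:
    """Hypothetical helper: return the expanded commit info from version.txt,
    or None when the file still holds the raw $Format:...$ placeholder
    (i.e. we are running from a git checkout rather than a release archive)."""
    text = Path(path).read_text().strip()
    return None if text.startswith("$Format") else text

print(read_build_version() or "not a release archive; ask git for the version instead")
```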