The csv package provides two complementary features:
-
CSV Validation using struct tags
- Detect invalid values with precise error positions
- Multi-language error messages (English, Japanese, Russian)
- Validation rules expressed through
validate:tags
-
A pandas-like DataFrame engine for CSV processing
- Inspired by the author’s experience learning machine learning with pandas
- Built on top of
nao1215/filesql - Brings SQL-backed, pandas-like transformations into pure Go
- Suitable for data exploration, ETL preprocessing, and joining multiple CSV files without Python
Both modules can be used independently.
- Linux
- macOS
- Windows
- Go 1.24 or later
- English
- Japanese
- Russian
Validation rules are written using struct tags.
The order of struct fields must match the column order of the CSV file.
input := `id,name,age,password,password_confirm,role,note,nickname,ip,cidr,url
1,Alice,17,Secret123,Secret12,superuser,"TODO: fix",alice!,999.0.0.1,10.0.0.0/33,http://example.com
-5,Bob,30,short,short,admin,"Note: ready",Bob123,192.168.0.1,192.168.0.0/24,https://example.com
`
type account struct {
ID int `validate:"number,gte=1"`
Name string `validate:"alpha"`
Age int `validate:"number,gte=18,lte=65"`
Password string `validate:"required,gte=8"`
PasswordConfirm string `validate:"eqfield=Password"`
Role string `validate:"oneof=admin user"`
Note string `validate:"excludes=TODO,startswith=Note"`
Nickname string `validate:"alphanumunicode"`
IP string `validate:"ip4_addr"`
CIDR string `validate:"cidrv4"`
URL string `validate:"https_url"`
}
var accounts []account
buf := bytes.NewBufferString(input)
csvReader, _ := csv.NewCSV(buf)
errs := csvReader.Decode(&accounts)
for _, err := range errs {
fmt.Println(err.Error())
}Output:
line:2 column age: target is not greater than or equal to the threshold value: threshold=18, value=17
line:2 column password: target is not greater than or equal to the threshold value: value=Secret123
line:2 column role: target is not one of the values: oneof=admin user, value=superuser
line:2 column note: target contains a prohibited substring: excludes=TODO, value=TODO: fix
line:2 column note: target does not start with the specified value: startswith=Note, value=TODO: fix
line:2 column nickname: target is not an alphanumeric unicode character: value=alice!
line:2 column ip: target is not a valid IPv4 address: value=999.0.0.1
line:2 column cidr: target is not a valid IPv4 CIDR: value=10.0.0.0/33
line:2 column url: target is not a valid HTTPS URL: value=http://example.com
line:2 column password_confirm: field is not equal to the specified field: field=PasswordConfirm, other=Password
line:3 column id: target is not greater than or equal to the threshold value: threshold=1, value=-5
line:3 column password: target is not greater than or equal to the threshold value: value=short
| Tag Name | Description |
|---|---|
| eqfield | Equal to another field in the same row |
| fieldcontains | Field contains the value of another field (same row) |
| fieldexcludes | Field does not contain the value of another field (same row) |
| gtefield | Field Greater Than or Equal To Another Field (same row) |
| gtfield | Greater than another field in the same row |
| ltefield | Less or equal to another field in the same row |
| ltfield | Less than another field in the same row |
| nefield | Not equal to another field in the same row |
| Tag Name | Description |
|---|---|
| alpha | Alphabetic characters only |
| alphaunicode | Unicode alphabetic characters |
| alphanumeric | Alphanumeric characters |
| alphanumunicode | Unicode letters and digits |
| alphaspace | Alphabetic characters and spaces |
| ascii | ASCII characters only |
| boolean | Boolean values |
| contains | Contains substring |
| containsany | Contains any of the specified characters |
| containsrune | Contains the specified rune |
| endsnotwith | Must not end with the specified substring |
| endswith | Ends with the specified substring |
| excludes | Must not contain the specified substring |
| excludesall | Must not contain any of the specified runes |
| excludesrune | Must not contain the specified rune |
| lowercase | Lowercase only |
| multibyte | Contains at least one multibyte character |
| number | Signed integer or decimal number |
| numeric | Numeric only |
| datauri | Valid Data URI (data:*;base64,…) |
| printascii | Printable ASCII characters only |
| startsnotwith | Must not start with the specified substring |
| startswith | Starts with the specified substring |
| uppercase | Uppercase only |
| Tag Name | Description |
|---|---|
| Valid email address | |
| uri | Valid URI (scheme required, host optional) |
| url | Valid URL (scheme required; file: allows path-only, other schemes require URL shape) |
| http_url | Valid HTTP(S) URL with host |
| https_url | Valid HTTPS URL with host |
| url_encoded | URL-encoded string (percent escapes, no malformed % sequences) |
| datauri | Valid Data URI (data:*;base64,…) |
| fqdn | Valid fully qualified domain name |
| hostname | Hostname (RFC 952) |
| hostname_rfc1123 | Hostname (RFC 1123) |
| hostname_port | Host and port combination |
| uuid | UUID string (with hyphens) |
| Tag Name | Description |
|---|---|
| cidr | Valid CIDR (IPv4 or IPv6) |
| cidrv4 | Valid IPv4 CIDR |
| cidrv6 | Valid IPv6 CIDR |
| ip_addr | IPv4 or IPv6 address |
| ip4_addr | IPv4 address only |
| ip6_addr | IPv6 address only |
| Tag Name | Description |
|---|---|
| eq | Equal to the specified value |
| eq_ignore_case | Equal to the specified value (case-insensitive) |
| gt | Greater than |
| gte | Greater or equal |
| lt | Less than |
| lte | Less or equal |
| ne_ignore_case | Not equal to the specified value (case-insensitive) |
| ne | Not equal |
| Tag Name | Description |
|---|---|
| len | Exact length |
| max | Maximum value |
| min | Minimum value |
| oneof | Must match one of given values |
| required | Must not be empty |
The DataFrame API provides pandas-like transformations for CSV files using lazy SQL execution.
| Method | Description |
|---|---|
func (df DataFrame) Select(cols ...string) DataFrame |
Select columns |
func (df DataFrame) Filter(expr string) DataFrame |
Filter rows using SQL expressions |
func (df DataFrame) Drop(cols ...string) DataFrame |
Drop columns |
func (df DataFrame) Rename(map[string]string) DataFrame |
Rename columns |
func (df DataFrame) Mutate(col, expr string) DataFrame |
Create derived columns |
func (df DataFrame) Sort(col string, asc bool) DataFrame |
Sort rows |
func (df DataFrame) DropNA(cols ...string) DataFrame |
Remove rows with NULL values |
func (df DataFrame) FillNA(col string, value any) DataFrame |
Fill NULL values |
func (df DataFrame) Cast(col, dtype string) DataFrame |
Cast column type |
func (df DataFrame) Join(other DataFrame, key string) DataFrame |
Inner join |
func (df DataFrame) LeftJoin(other DataFrame, key string) DataFrame |
Left join |
func (df DataFrame) RightJoin(other DataFrame, key string) DataFrame |
Right join |
func (df DataFrame) FullJoin(other DataFrame, key string) DataFrame |
Full join |
func (df DataFrame) Merge(other DataFrame, opts MergeOptions) DataFrame |
pandas-like merge |
func (df DataFrame) Rows() ([]map[string]any, error) |
Evaluate query and return results |
func (df DataFrame) Head(n int) ([]map[string]any, error) |
Return first n rows |
func (df DataFrame) Tail(n int) ([]map[string]any, error) |
Return last n rows |
func (df DataFrame) Print(w io.Writer) error |
Pretty-print table |
func (df DataFrame) ToCSV(path string) error |
Write results as CSV |
func (df DataFrame) DebugSQL() string |
Show generated SQL |
func (df DataFrame) Columns() []string |
List of columns |
func (df DataFrame) Shape() (int, int) |
(rows, columns) |
Notes
DataFrame.Filterinlines the provided SQL expression verbatim. Always sanitize or parameterize user input before passing it toFilter.RightJoin/FullJoinrequire backend support. SQLite (used by filesql) does not implement RIGHT/FULL OUTER JOIN, so those methods will return SQL errors on SQLite-based setups.
package main
import (
"fmt"
"log"
"os"
"github.com/nao1215/csv"
)
func main() {
if err := os.WriteFile("people.csv", []byte("id,name,age\n1,Alice,23\n2,Bob,30\n"), 0o600); err != nil {
log.Fatal(err)
}
df := csv.NewDataFrame("people.csv").
Select("name", "age").
Filter("age >= 25").
Mutate("decade", "age / 10").
Sort("age", true)
rows, err := df.Rows()
if err != nil {
log.Fatal(err)
}
fmt.Println(rows[0]["name"], rows[0]["decade"])
}MIT License. See LICENSE.
Contributions are welcome! See CONTRIBUTING.md.
Star support is greatly appreciated.
I was inspired by the following OSS. Thank you for your great work!!
CHIKAMATSU Naohiro 📖 |
This project follows the all-contributors specification.
