Update pyTigerGraphLoading.py to add support on direct data loading #260

chengbiao-jin · 2024-10-23T20:18:45Z

Add function to support data loading from a string directly instead of a file.

parkererickson-tg

We auto-generate the docs from docstrings, so we need a full docstring for runLoadingJobWithData. Is the data parameter a true string or a bytestring (which is how we read the file in runLoadingJobWithFile)?
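One way to make the str-versus-bytestring question moot is to normalise the input before it is sent. The following is a minimal sketch, assuming UTF-8 is the wanted encoding; `to_request_body` is a hypothetical helper for illustration and not a pyTigerGraph function.

```python
from typing import Union


def to_request_body(data: Union[str, bytes], encoding: str = "utf-8") -> bytes:
    """Return `data` as bytes, encoding it first if it is a str.

    Hypothetical helper: name, signature and default encoding are assumptions.
    """
    if isinstance(data, bytes):
        # Already a bytestring, e.g. the result of open(path, "rb").read().
        return data
    if isinstance(data, str):
        return data.encode(encoding)
    raise TypeError(f"data must be str or bytes, got {type(data).__name__}")
```

With a helper like this, the docstring could state that both types are accepted and which encoding is applied to strings.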

songting-chen · 2024-10-23T21:07:10Z

pyTigerGraph/pyTigerGraphLoading.py

+        FILENAME definition will be updated to point to the data received.
+
+        NOTE: The argument `USING HEADER="true"` in the GSQL loading job may not be enough to
+        load the file correctly. Remove the header from the data file before using this function.


The comment is confusing; the same goes for runLoadingJobWithFile.

So the header should be removed before calling these two functions. If the loading job still has USING HEADER="true", will the first line be ignored?

The original function does not support it, so I have not changed it yet.

Actually, I'd prefer to support HEADER=true in these two functions so that users can provide the parameters according to the loading job. @parkererickson-tg, do you have any background on why it might not load correctly with HEADER specified?

It is a long-standing bug in the DDL system.

header=False will be required in df.to_csv() in this case.
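For readers without a DataFrame at hand, the same header-less payload can be produced with the standard library. This is a minimal sketch; `rows_to_csv` and the sample rows are hypothetical, and the separator is assumed to match the one declared in the loading job.

```python
import csv
import io


def rows_to_csv(rows, sep=","):
    """Serialize an iterable of rows to CSV text with no header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=sep, lineterminator="\n")
    writer.writerows(rows)  # only data rows are written, never column names
    return buffer.getvalue()


people = [(1, "Alice"), (2, "Bob")]
payload = rows_to_csv(people)
```

With pandas, `df.to_csv(header=False)` plays the same role; since `to_csv` also writes the index column by default, `index=False` is usually wanted as well.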

OK, make it explicit that the loading job definition should use USING HEADER=false.

HEADER=false is actually the default behavior.

Right. What I mean is: the user should not set USING HEADER=true in the loading job in this case, because otherwise they would lose one row?
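The concern in this thread can be illustrated with a toy model. The sketch below assumes, as the question above suggests but the thread does not confirm, that a job with HEADER=true discards the first line it receives. `strip_header` and `lines_loaded` are hypothetical names, and nothing here calls TigerGraph.

```python
def strip_header(data: str) -> str:
    """Drop the first line of CSV text (assumes the header has no embedded newline)."""
    _, _, rest = data.partition("\n")
    return rest


def lines_loaded(data: str, header: bool) -> list:
    """Toy model of the assumption: with header=True the first line is discarded."""
    lines = data.splitlines()
    return lines[1:] if header else lines


with_header = "id,name\n1,Alice\n2,Bob\n"
headerless = strip_header(with_header)

# Header already removed and the job uses HEADER=false: both rows arrive.
kept = lines_loaded(headerless, header=False)
# Header already removed but the job still says HEADER=true: one data row is lost.
lost_one = lines_loaded(headerless, header=True)
```

Under that assumption, removing the header and leaving HEADER=true in the job would silently drop the first data row, which is why the thread asks for the docstring to be explicit.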

parkererickson-tg

Can we get some unit tests on this too?

qe-tigergraph

Unit Test: FAILURE, Jenkins_job:http://192.168.99.101:30080/job/mlwb_build/1232/

qe-tigergraph

Unit Test: SUCCESS, e2e Test: SKIPPED, Jenkins_job:http://192.168.99.101:30080/job/mlwb_build/1234/

qe-tigergraph

QE Approved

chengbiao-jin · 2024-10-28T19:33:39Z

> Can we get some unit tests on this too?

@parkererickson-tg where do we put the unit test? Is there an example?

parkererickson-tg · 2024-10-28T19:41:24Z

> > Can we get some unit tests on this too?
>
> @parkererickson-tg where do we put the unit test? Is there an example?

Looks like we actually are missing tests on our entire loading job execution functionality... here is a test file for our vertex functions: https://github.com/tigergraph/pyTigerGraph/blob/master/tests/test_pyTigerGraphVertex.py. We put test fixtures like GSQL files here: https://github.com/tigergraph/pyTigerGraph/blob/master/tests/fixtures/create_query_simple.gsql.
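As a starting point for such tests, here is a minimal sketch that needs no running server. It is not the repository's existing test convention: `load_csv_text`, the file tag and the job name are hypothetical, and the positional argument order passed to `runLoadingJobWithData` is an assumption, since the thread does not show the final signature.

```python
import unittest
from unittest.mock import MagicMock


def load_csv_text(conn, data, file_tag, job_name):
    """Hypothetical wrapper: reject empty input, then hand the text to the connection."""
    if not data:
        raise ValueError("data must not be empty")
    # Assumed argument order (data, file tag, job name); check the real docstring.
    return conn.runLoadingJobWithData(data, file_tag, job_name)


class TestLoadCsvText(unittest.TestCase):
    def setUp(self):
        # A MagicMock stands in for a live TigerGraph connection.
        self.conn = MagicMock()

    def test_passes_data_through(self):
        load_csv_text(self.conn, "1,Alice\n", "MyDataSource", "load_people")
        self.conn.runLoadingJobWithData.assert_called_once_with(
            "1,Alice\n", "MyDataSource", "load_people"
        )

    def test_rejects_empty_data(self):
        with self.assertRaises(ValueError):
            load_csv_text(self.conn, "", "MyDataSource", "load_people")
        self.conn.runLoadingJobWithData.assert_not_called()
```

Saved as a file under `tests/`, this can be run with `python -m unittest`. A test against a real loading job would additionally need a GSQL fixture such as the one linked above.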

parkererickson-tg · 2024-10-28T20:27:04Z

@chengbiao-jin It would be nice if you could add the support for async functionality as well, as I just merged that PR today, which was a pretty large refactor. If you don't have bandwidth, I can probably pick it up this week.

chengbiao-jin · 2024-10-28T21:59:30Z

I'll find some time to work on it tomorrow.


Update pyTigerGraphLoading.py

9b8fc1b

Add function to support data loading from a string directly instead of a file.

chengbiao-jin requested a review from qe-tigergraph as a code owner October 23, 2024 20:18

chengbiao-jin requested review from parkererickson-tg and songting-chen October 23, 2024 20:19

parkererickson-tg suggested changes Oct 23, 2024

songting-chen reviewed Oct 23, 2024

Update logger

0b725bd

songting-chen reviewed Oct 23, 2024

chengbiao-jin added 3 commits October 23, 2024 15:44

Update logger

cec2ca3

Update Doc

b4a2f39

Update pyTigerGraphLoading.py

ff76d24

songting-chen reviewed Oct 23, 2024

chengbiao-jin added 2 commits October 23, 2024 18:20

Add runLoadingJobWithDF function

5ada406

Update separator

4f3e052

parkererickson-tg suggested changes Oct 24, 2024

Revise uploadDF function name

696c52c

qe-tigergraph requested changes Oct 25, 2024

qe-tigergraph reviewed Oct 25, 2024

qe-tigergraph approved these changes Oct 25, 2024

chengbiao-jin added 4 commits October 29, 2024 20:58

rebase

1957332

support async loading

0decc2e

support async loading

bb062ec

update format

7cb8aed

parkererickson-tg approved these changes Oct 30, 2024

parkererickson-tg merged commit dfc312d into master Oct 30, 2024

parkererickson-tg deleted the cjin_add_data_loading branch October 30, 2024 15:29
