Our documentation has some information on how to use user-defined functions (UDFs) in pandas, in the API pages of the relevant methods and in these sections:
My understanding is that we've mostly been discouraging the use of functions like `apply`, or at least the community has, with many posts and comments saying that `apply` is slow, which seems fair. With the ongoing work to support JIT compilers in these functions (see #54666 and #61032), this can hopefully change and, in some cases, allow for clearer code without compromising speed.
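To make the trade-off concrete, here is a minimal sketch of the kind of comparison behind the "apply is slow" advice. The column names and data are made up for illustration; the point is only that both forms give the same result, while the row-wise version calls a Python function once per row.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})


def total(row):
    # Called once per row, with the row passed as a Series.
    return row["price"] * row["qty"]


# Row-wise UDF: flexible, but each row goes through the Python interpreter.
via_apply = df.apply(total, axis=1)

# Vectorized equivalent: a single operation over whole columns.
vectorized = df["price"] * df["qty"]
```

The UDF version arguably reads closer to the business logic, which is the clarity a faster execution engine could make affordable.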
I think it may be difficult to communicate all the information related to UDFs in the existing group by and FAQ sections and in the API docs. A dedicated page in the user guide that explains when to use UDFs, gives a general idea of the API, and covers the differences between the methods and the options available seems a better idea.
Also, the APIs of the different methods are quite inconsistent, and in some cases cumbersome. I think writing this page will be a good exercise to identify cases where explaining the functionality to users is complex and unintuitive, and to see if we can address them.
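One example of the kind of inconsistency I mean, as a small sketch (the helper name `describe_input` is hypothetical and only used here): the same method name, `apply`, passes different things to the user function depending on whether it is called on a `Series` or a `DataFrame`.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})


def describe_input(x):
    # Report what kind of object the UDF actually receives.
    return type(x).__name__


# Series.apply calls the function once per element.
series_result = s.apply(describe_input)

# DataFrame.apply (default axis=0) calls the function once per column,
# passing each column as a Series.
frame_result = df.apply(describe_input)
```

Here `series_result` has one entry per element, while `frame_result` has one entry per column, each reporting that the function received a `Series`.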
I'm interested in working on this user guide. Just to clarify, this user guide should provide guidance on everything about UDFs (when to use them, their differences, etc.), rather than just document what they do, correct?