[FEATURE REQUEST]: Spark 3.0 Readiness #633


Closed
47 of 61 tasks
suhsteve opened this issue Aug 19, 2020 · 9 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)
Milestone: 1.0.0
Comments

suhsteve (Member) commented Aug 19, 2020

APIs

SparkSession

DataFrame

DataFrameStatFunctions

DataFrameWriterV2

RelationalGroupedDataset

  • Scala: `def as[K: Encoder, T: Encoder]: KeyValueGroupedDataset[K, T]`
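As a hedged sketch of what this new Spark 3.0 Scala API enables (the case class, column names, and values below are invented for illustration; `spark` is assumed to be an existing `SparkSession`), `as` converts a `RelationalGroupedDataset` into a typed `KeyValueGroupedDataset`:

```scala
// Illustrative only: RelationalGroupedDataset.as[K, T], new in Spark 3.0.
// Assumes a SparkSession named `spark`; Sale and the data are made up.
import spark.implicits._

case class Sale(store: String, amount: Long)
val sales = Seq(Sale("a", 10), Sale("a", 5), Sale("b", 7)).toDF()

// groupBy(...).as[K, T] yields a KeyValueGroupedDataset[K, T],
// unlocking typed operations such as mapGroups and cogroup.
val byStore = sales.groupBy($"store").as[String, Sale]
val totals  = byStore.mapGroups((store, rows) => (store, rows.map(_.amount).sum))
```

Because this API returns a `KeyValueGroupedDataset` (a typed `Dataset` abstraction), it is tied to the Dataset support discussed further down in this thread.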

Functions

  • C#: `public static Column XXHash64(params Column[] columns)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Split(Column column, string pattern, int limit)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Overlay(Column src, Column replace, Column pos, Column len)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Overlay(Column src, Column replace, Column pos)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column AddMonths(Column startDate, Column numMonths)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column DateAdd(Column start, Column days)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column DateSub(Column start, Column days)` (Spark 3.0 readiness part 2 #649)
  • Scala: `def transform(column: Column, f: Column => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def transform(column: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def exists(column: Column, f: Column => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def forall(column: Column, f: Column => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def filter(column: Column, f: Column => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def filter(column: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def aggregate(expr: Column, initialValue: Column, merge: (Column, Column) => Column, finish: Column => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def aggregate(expr: Column, initialValue: Column, merge: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def zip_with(left: Column, right: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def transform_keys(expr: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def transform_values(expr: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def map_filter(expr: Column, f: (Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • Scala: `def map_zip_with(left: Column, right: Column, f: (Column, Column, Column) => Column): Column` (Unsupported: passing a function as a parameter)
  • C#: `public static Column SchemaOfJson(Column json, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column MapEntries(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column FromCsv(Column column, StructType schema, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column FromCsv(Column column, Column schema, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column SchemaOfCsv(string csv)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column SchemaOfCsv(Column csv)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column SchemaOfCsv(Column csv, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column ToCsv(Column column, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column ToCsv(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Years(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Months(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Days(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Hours(Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Bucket(Column numBuckets, Column column)` (Spark 3.0 readiness part 2 #649)
  • C#: `public static Column Bucket(int numBuckets, Column column)` (Spark 3.0 readiness part 2 #649)
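As a hedged illustration of what the checklist items above cover (assuming the C# signatures land as listed; the session setup, SQL literals, and column names here are invented for the example), a few of the new Spark 3.0 functions could be used from .NET like this:

```csharp
// Illustrative only: sketch of the Spark 3.0 Functions overloads listed above.
// Column/data names are made up; signatures follow the checklist.
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

class Spark30Sketch
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder().GetOrCreate();
        DataFrame df = spark.Sql("SELECT 'a,b,c' AS csv, 'hello world' AS text");

        df.Select(
                Split(Col("csv"), ",", 2),                  // new overload with a split limit
                Overlay(Col("text"), Lit("Spark"), Lit(7)), // replace starting at position 7
                XXHash64(Col("text")))                      // 64-bit xxHash of the column
            .Show();
    }
}
```

The Scala entries marked "Unsupported: passing a function as a parameter" have no such C# sketch because a `Column => Column` lambda cannot be marshalled across the .NET/JVM boundary, which is why they are excluded from the checklist's C# surface.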
suhsteve added the enhancement label on Aug 19, 2020
rrekapalli commented Aug 19, 2020

Are the Dataset API and a full MLlib implementation also going to be part of this version?

suhsteve added the good first issue and help wanted labels on Aug 19, 2020
suhsteve added this to the 1.0.0 milestone on Aug 19, 2020
imback82 (Contributor) commented
@suhsteve Thanks for compiling this. Can you also compile the corresponding list from Scala?

imback82 (Contributor) commented
@rrekapalli Dataset is not supported due to this: #103 (comment)

We will not have full MLlib support as part of 1.0.0, but we will track it separately in #381.

rapoth pinned this issue on Aug 19, 2020
rapoth (Contributor) commented Aug 19, 2020

CC: @GoEddie on the MLlib question in case he has more to add.

GoEddie (Contributor) commented Aug 19, 2020

@rrekapalli I have been implementing the newer ML API (some internal classes are still MLlib); if you would like to contribute, I am happy to help you with any PRs.

#381 is for ML.Features, and there is more to come after features.

suhsteve (Member, Author) commented
@imback82 Added the list of Scala APIs as well.

rrekapalli commented
@GoEddie, I am new to Spark and am just trying out this library for one of my use cases. However, I can definitely give it a try from the second week of September.

Niharikadutta (Collaborator) commented
I'm thinking of dividing this into 3 PRs:

  1. For SparkSession, DataFrame and DataFrameStatFunctions APIs
  2. For DataFrameWriterV2
  3. For Functions APIs

Niharikadutta (Collaborator) commented
All features in this issue have been merged; this can be closed.

6 participants