[FEATURE REQUEST]: Spark 3.0 Readiness #633
Comments
Are the Dataset API and a full MLlib implementation also going to be part of this version?
@suhsteve Thanks for compiling this. Can you also compile the list from Scala?
@rrekapalli We will not have a full MLLib as part of 1.0.0, but we will track it separately in #381.
CC: @GoEddie on the MLLib question in case he has more to add.
@rrekapalli I have been implementing the newer ML API (some internal classes are still MLLib); if you would like to contribute, I am happy to help you with any PRs. #381 is for ML.Features, and there is more after features as well.
@imback82 Added the list of Scala APIs as well.
@GoEddie I am new to Spark and just trying out this library for one of my use cases. However, I can definitely give it a try from the second week of September.
I'm thinking of dividing this into 3 PRs:

All features in this issue have been merged; this can be closed.
APIs

SparkSession

- `public static void SetActiveSession(SparkSession session)` (Full support for multithreaded applications #641)
- `public static void ClearActiveSession()` (Full support for multithreaded applications #641)
- `public static SparkSession GetActiveSession()` (Full support for multithreaded applications #641)
- `public DataFrame ExecuteCommand(string runner, string command, Dictionary<string, string> options)` (Spark 3.0 readiness part 1 #647)
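A minimal sketch of how the three active-session APIs from #641 fit together; it assumes a session created with the standard builder, and the app name is illustrative:

```csharp
using Microsoft.Spark.Sql;

SparkSession spark = SparkSession.Builder().AppName("demo").GetOrCreate();

// Register the session as the active one for this thread.
SparkSession.SetActiveSession(spark);

// Elsewhere in the same thread, fetch it back instead of passing it around.
SparkSession active = SparkSession.GetActiveSession();
active.Sql("SELECT 1").Show();

// Drop the association when the thread is done with the session.
SparkSession.ClearActiveSession();
```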
DataFrame

- `public DataFrame Transform(Func<DataFrame, DataFrame> func)` (Expose DataFrame.Transform #688)
- `public IEnumerable Tail(int n)` (Spark 3.0 readiness part 1 #647)
- `public void PrintSchema(int level)` (Spark 3.0 readiness part 1 #647)
- `public void Explain(string mode)` (Spark 3.0 readiness part 1 #647)
- `public DataFrame Observe(string name, Column expr, params Column[] exprs)` (Spark 3.0 readiness part 1 #647)
- `WriteTo(string table)` (Adding support for DataFrameWriterV2 #677)
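A sketch exercising the new DataFrame members above, assuming an existing session `spark`; everything else follows the signatures listed:

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

DataFrame df = spark.Range(0, 100);

// Transform threads the whole DataFrame through a user function.
DataFrame filtered = df.Transform(d => d.Filter(Col("id").Gt(50)));

// Tail collects the last n rows to the driver.
foreach (Row row in filtered.Tail(3))
{
    Console.WriteLine(row);
}

// Depth-limited schema printing and the new string-typed explain modes.
filtered.PrintSchema(1);
filtered.Explain("formatted");

// Observe attaches named metrics that are computed as the query runs.
df.Observe("stats", Count(Col("id"))).Show();
```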
DataFrameStatFunctions

- `public DataFrame SampleBy(Column column, IDictionary<T, double> fractions, long seed)` (Spark 3.0 readiness part 1 #647)
DataFrameWriterV2

- `public DataFrameWriterV2 Using(string provider)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 Option(string key, string value)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 Option(string key, bool value)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 Option(string key, long value)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 Option(string key, double value)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 Options(Dictionary<string, string> options)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 TableProperty(string property, string value)` (Adding support for DataFrameWriterV2 #677)
- `public DataFrameWriterV2 PartitionedBy(Column column, params Column[] columns)` (Adding support for DataFrameWriterV2 #677)
- `public void Create()` (Adding support for DataFrameWriterV2 #677)
- `public void Replace()` (Adding support for DataFrameWriterV2 #677)
- `public void CreateOrReplace()` (Adding support for DataFrameWriterV2 #677)
- `public void Append()` (Adding support for DataFrameWriterV2 #677)
- `public void Overwrite(Column condition)` (Adding support for DataFrameWriterV2 #677)
- `public void OverwritePartitions()` (Adding support for DataFrameWriterV2 #677)
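The writer-v2 members chain from `DataFrame.WriteTo`. A sketch, assuming an existing DataFrame `df` and a catalog that supports the v2 table API; the table name, properties, and partition column are made up:

```csharp
using static Microsoft.Spark.Sql.Functions;

// Create a new partitioned table from the DataFrame's contents.
df.WriteTo("demo_db.events")
    .Using("parquet")
    .Option("compression", "snappy")
    .TableProperty("owner", "data-team")
    .PartitionedBy(Col("year"))
    .Create();

// On a later run, replace only the partitions covered by the new data.
df.WriteTo("demo_db.events").OverwritePartitions();
```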
RelationalGroupedDataset

- `def as[K: Encoder, T: Encoder]: KeyValueGroupedDataset[K, T]` (Scala)
Functions

- `public static Column XXHash64(params Column[] columns)` (Spark 3.0 readiness part 2 #649)
- `public static Column Split(Column column, string pattern, int limit)` (Spark 3.0 readiness part 2 #649)
- `public static Column Overlay(Column src, Column replace, Column pos, Column len)` (Spark 3.0 readiness part 2 #649)
- `public static Column Overlay(Column src, Column replace, Column pos)` (Spark 3.0 readiness part 2 #649)
- `public static Column AddMonths(Column startDate, Column numMonths)` (Spark 3.0 readiness part 2 #649)
- `public static Column DateAdd(Column start, Column days)` (Spark 3.0 readiness part 2 #649)
- `public static Column DateSub(Column start, Column days)` (Spark 3.0 readiness part 2 #649)
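A sketch of the new date overloads, assuming an existing session `spark`; note the offset argument is a Column rather than a literal int, so it can come from the data:

```csharp
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

DataFrame df = spark.Sql("SELECT DATE'2020-01-31' AS start, 2 AS n");

// Each offset below is read from column n, not hard-coded.
df.Select(
        AddMonths(Col("start"), Col("n")),
        DateAdd(Col("start"), Col("n")),
        DateSub(Col("start"), Col("n")))
    .Show();
```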
The following Scala signatures remain unsupported because they take a function as a parameter:

- `def transform(column: Column, f: Column => Column): Column`
- `def transform(column: Column, f: (Column, Column) => Column): Column`
- `def exists(column: Column, f: Column => Column): Column`
- `def forall(column: Column, f: Column => Column): Column`
- `def filter(column: Column, f: Column => Column): Column`
- `def filter(column: Column, f: (Column, Column) => Column): Column`
- `def aggregate(expr: Column, initialValue: Column, merge: (Column, Column) => Column, finish: Column => Column): Column`
- `def aggregate(expr: Column, initialValue: Column, merge: (Column, Column) => Column): Column`
- `def zip_with(left: Column, right: Column, f: (Column, Column) => Column): Column`
- `def transform_keys(expr: Column, f: (Column, Column) => Column): Column`
- `def transform_values(expr: Column, f: (Column, Column) => Column): Column`
- `def map_filter(expr: Column, f: (Column, Column) => Column): Column`
- `def map_zip_with(left: Column, right: Column, f: (Column, Column, Column) => Column): Column`
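These higher-order functions have no direct C# binding, but the same operations are reachable through Spark SQL's lambda syntax via `Functions.Expr`. A sketch, assuming an existing session `spark`:

```csharp
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

DataFrame df = spark.Sql("SELECT array(1, 2, 3) AS xs");

// The lambdas live inside the SQL string, so no function value
// has to cross the .NET/JVM boundary.
df.Select(
        Expr("transform(xs, x -> x + 1)").Alias("plus_one"),
        Expr("filter(xs, x -> x % 2 = 1)").Alias("odds"),
        Expr("aggregate(xs, 0, (acc, x) -> acc + x)").Alias("total"))
    .Show();
```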
- `public static Column SchemaOfJson(Column json, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
- `public static Column MapEntries(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column FromCsv(Column column, StructType schema, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
- `public static Column FromCsv(Column column, Column schema, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
- `public static Column SchemaOfCsv(string csv)` (Spark 3.0 readiness part 2 #649)
- `public static Column SchemaOfCsv(Column csv)` (Spark 3.0 readiness part 2 #649)
- `public static Column SchemaOfCsv(Column csv, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
- `public static Column ToCsv(Column column, Dictionary<string, string> options)` (Spark 3.0 readiness part 2 #649)
- `public static Column ToCsv(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Years(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Months(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Days(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Hours(Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Bucket(Column numBuckets, Column column)` (Spark 3.0 readiness part 2 #649)
- `public static Column Bucket(int numBuckets, Column column)` (Spark 3.0 readiness part 2 #649)
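A sketch of the CSV helpers above, assuming an existing session `spark`; the sample row and the DDL schema string are made up for illustration:

```csharp
using System.Collections.Generic;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

DataFrame df = spark.Sql("SELECT '1,abc' AS value");

// Ask Spark to infer a DDL schema string from a sample row...
df.Select(SchemaOfCsv(Lit("1,abc"))).Show();

// ...then parse the column with an explicit schema and parser options.
var options = new Dictionary<string, string> { { "sep", "," } };
df.Select(FromCsv(Col("value"), Lit("a INT, b STRING"), options)).Show();
```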