Update performance #282

Open
@Jolanrensen

Description

Looking into the implementation of Update, it seems to me that the row filter is evaluated again for each column being updated, while it could be evaluated just once.

As a broad overview, the current situation looks like:

fun updateImpl() {
    df.replace(columns).with { it: DataColumn -> /* filter, given entire df row and replace column */ }
}

while it could be

fun updateImpl() {
    df.filter { /* Filter rows */ }.replace(columns).with { it: DataColumn -> /* replace column */ }
}

There is of course a trade-off: this doesn't work with Update.perCol {}, because perCol would then be given a filtered column instead of the entire column it gets now. So we need to think this through carefully before changing anything.
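
To make the cost difference concrete, here is a minimal sketch on a toy model. It does not use the DataFrame library or its actual Update implementation: `Frame`, `Row`, `updatePerColumn`, and `updateWithMask` are hypothetical names made up for illustration. The second variant also differs slightly from the `df.filter { }` idea above: it assumes the filter can be evaluated once into a boolean mask and reused, which keeps the entire column available. Whether that fits the real implementation is an open question.

```kotlin
// Toy model, NOT the DataFrame library: a frame is a map of column name to values.
typealias Frame = Map<String, List<Any?>>
typealias Row = Map<String, Any?>

fun Frame.rowCount(): Int = values.firstOrNull()?.size ?: 0

// Builds a row view by picking the value at `index` from every column.
fun Frame.row(index: Int): Row = mapValues { (_, column) -> column[index] }

// Shape of the current situation: the row predicate runs again for every updated column.
fun updatePerColumn(
    frame: Frame,
    columns: List<String>,
    predicate: (Row) -> Boolean,
    transform: (Any?) -> Any?,
): Frame = frame.mapValues { (name, column) ->
    if (name !in columns) column
    else column.mapIndexed { i, v -> if (predicate(frame.row(i))) transform(v) else v }
}

// Hypothetical alternative: evaluate the predicate once per row, then reuse the mask.
fun updateWithMask(
    frame: Frame,
    columns: List<String>,
    predicate: (Row) -> Boolean,
    transform: (Any?) -> Any?,
): Frame {
    val mask = BooleanArray(frame.rowCount()) { i -> predicate(frame.row(i)) }
    return frame.mapValues { (name, column) ->
        if (name !in columns) column
        else column.mapIndexed { i, v -> if (mask[i]) transform(v) else v }
    }
}

fun main() {
    val frame: Frame = mapOf(
        "a" to listOf(1, 2, 3),
        "b" to listOf(10, 20, 30),
        "c" to listOf(100, 200, 300),
    )
    var calls = 0
    val predicate: (Row) -> Boolean = { row -> calls++; (row["a"] as Int) > 1 }
    val double: (Any?) -> Any? = { (it as Int) * 2 }

    val perColumn = updatePerColumn(frame, listOf("b", "c"), predicate, double)
    println("per column: $calls predicate calls") // 6 = 3 rows * 2 updated columns

    calls = 0
    val masked = updateWithMask(frame, listOf("b", "c"), predicate, double)
    println("with mask: $calls predicate calls") // 3 = one call per row

    println(perColumn == masked) // true: both variants produce the same frame
}
```

In this toy version the number of predicate calls drops from rows times updated columns to just rows, and the result is unchanged.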

Labels

performance: Something related to how fast the library can handle data
