Description
New feature
Add the capability to access attributes of the decompressed file (at least the actual decompressed size in bytes) on the Path object.
Use case
More often than not, the memory required by an algorithm (through its implementation) is a fixed multiple of the memory taken by the input data, plus a small fixed allocation for the code itself. From experience, that information provides a very effective heuristic for pinning memory for a given process, accounting for n-fold parallel memory allocations by its subprocesses. That is how I would like to define the memory directive for processes, dynamically using closures.
However, this heuristic cannot be applied when the input data is compressed, because the Path object doesn't expose the actual byte length of the decompressed data. I agree it's no fun to decompress before the processing itself (it is full of edge cases with the cloud, data in remote centers, and so on), but that information should be surfaced wherever possible.
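To make the heuristic concrete, here is one possible reading of it as a sketch in plain Java. All names and numbers are illustrative (not part of Nextflow): each of n parallel subprocesses is assumed to hold roughly multiplier × inputBytes, plus one fixed overhead for the code itself.

```java
public class MemoryHeuristic {
    // Illustrative estimate: peak memory ≈ multiplier * inputBytes per
    // subprocess, times the number of parallel subprocesses, plus a fixed
    // overhead. Note: inputBytes must be the *decompressed* size, which is
    // exactly what the Path object does not currently expose.
    static long estimate(long inputBytes, double multiplier,
                         int parallelism, long overheadBytes) {
        return (long) (multiplier * inputBytes) * parallelism + overheadBytes;
    }

    public static void main(String[] args) {
        // e.g. 4 GiB of decompressed input, a 1.5x working-set multiplier,
        // 2 parallel subprocesses, 512 MiB fixed overhead:
        long est = estimate(4L << 30, 1.5, 2, 512L << 20);
        System.out.println(est + " bytes");
    }
}
```

With the decompressed size available, such a function could back a dynamic `memory { ... }` closure instead of a hard-coded value.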
Suggested implementation
I have none; this is an open question. Maybe Nextflow input channels should support all popular compression formats so it can introspect them all, or define a subset of acceptable compression algorithms. Or just expose the attribute where the underlying Java implementation can provide it. I don't know, but this is an ever-growing limitation in my pipeline definitions. As of now, I cannot tell my users how much RAM they need; the current guidance is to give as much as they can to any failing job, which is a really bad habit on computing clusters and in the cloud.
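For gzip at least, a decompressed-size estimate is cheap to obtain without decompressing: RFC 1952 stores the uncompressed length modulo 2^32 in the 4-byte little-endian ISIZE trailer. A minimal sketch in plain Java (class and method names are hypothetical; the modulo caveat means this is only reliable for inputs under 4 GiB):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

public class GzipSize {
    // Read the ISIZE field of a gzip file: the last 4 bytes, little-endian,
    // holding the uncompressed size modulo 2^32 (RFC 1952, section 2.3.1).
    static long uncompressedSize(Path gz) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(gz.toFile(), "r")) {
            raf.seek(raf.length() - 4);
            long b0 = raf.read(), b1 = raf.read(), b2 = raf.read(), b3 = raf.read();
            return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo: compress 1 MiB of data, then recover its size from the
        // trailer alone, without decompressing the stream.
        byte[] data = new byte[1 << 20];
        Path gz = Files.createTempFile("demo", ".gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gz))) {
            out.write(data);
        }
        System.out.println(uncompressedSize(gz)); // 1048576
        Files.delete(gz);
    }
}
```

Other formats differ (bzip2 has no such trailer; some, like bgzf-indexed files, can do better), which is why a per-format capability on the Path object, backed by whatever the underlying Java implementation can do, may be the pragmatic route.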