Conversation

karthikeyann
@karthikeyann
Author

/ok to test 17269f3

copy-pr-bot bot pushed a commit that referenced this pull request Sep 10, 2025
Bikramjeet Vig and others added 18 commits September 12, 2025 11:01
Summary:
Pull Request resolved: facebookincubator#14671

Added MakeRowFromMap utility class to project specified keys from a
vector of MAP type into a RowVector with named fields. Takes a list
of keys to extract (keysToProject) and corresponding output field
names (outputFieldNames), with options to replace nulls with
type-specific defaults, allow top-level nulls in the output RowVector,
and control duplicate key handling. Optionally accepts an exec::EvalCtx
so that, when used from vector functions, it can apply expression-evaluation-specific
behavior such as per-row error handling. Currently only supports
SMALLINT, INTEGER, and BIGINT keys.
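
The projection described above can be sketched with standard containers. This is only an illustration of the idea: it does not use Velox vectors or the actual MakeRowFromMap interface, and `projectKeys`, `ProjectedRows`, `replaceNulls`, and the `std::map`/`std::optional` stand-ins are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: one input row is a map from integer key to a
// nullable value; one output row has one nullable slot per projected key.
using MapRow = std::map<int64_t, std::optional<int64_t>>;
using OutRow = std::vector<std::optional<int64_t>>;

struct ProjectedRows {
  std::vector<std::string> fieldNames;
  std::vector<OutRow> rows;
};

// Projects keysToProject out of every input map. Missing keys and null values
// become std::nullopt, or the type default (0) when replaceNulls is set.
ProjectedRows projectKeys(
    const std::vector<MapRow>& input,
    const std::vector<int64_t>& keysToProject,
    const std::vector<std::string>& outputFieldNames,
    bool replaceNulls) {
  // Each projected key needs exactly one output field name.
  assert(keysToProject.size() == outputFieldNames.size());
  ProjectedRows result;
  result.fieldNames = outputFieldNames;
  result.rows.reserve(input.size());
  for (const auto& row : input) {
    OutRow out;
    out.reserve(keysToProject.size());
    for (int64_t key : keysToProject) {
      std::optional<int64_t> value;
      auto it = row.find(key);
      if (it != row.end()) {
        value = it->second;
      }
      if (!value.has_value() && replaceNulls) {
        value = 0;
      }
      out.push_back(value);
    }
    result.rows.push_back(std::move(out));
  }
  return result;
}
```

With `replaceNulls` set, a key that is absent from a given map and a key that maps to null both yield 0 in this sketch; without it, both yield an empty optional.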

bypass-github-export-checks

Reviewed By: mbasmanova

Differential Revision: D81465416

fbshipit-source-id: be1189f0d0d75a008b4948d4718fa910daa3fe85
… compatibility (facebookincubator#14798)

Summary:
$USERNAME is not a standard variable; it is empty on macOS.

Pull Request resolved: facebookincubator#14798

Reviewed By: kagamiori

Differential Revision: D82029919

Pulled By: Yuhta

fbshipit-source-id: aa2577215a6f54d50c484324b96f065be7fada07
…cubator#14780)

Summary:
The 'buffers' variable is unused, so remove it.

Pull Request resolved: facebookincubator#14780

Reviewed By: kagamiori

Differential Revision: D82030707

Pulled By: Yuhta

fbshipit-source-id: 85ba4920ae394971b015cca800e8f23586871f60
Summary:
Pull Request resolved: facebookincubator#14841

feat: Add counter for nimble and dwrf writer

Reviewed By: tanjialiang

Differential Revision: D82275723

fbshipit-source-id: 1608fe29f40904be5fc1e7286702bc8ea7b6b681
…#10138)

Summary:
Support Spark `array_sort` with a `lambda` function as the comparator.
Because Spark's comparison implementation differs from Presto's, Presto's
`array_sort` implementation is refactored so that Spark can also rewrite the
`lambda` function as a simple comparator when possible.

This PR:
1. Moves Presto `array_sort` to `velox/functions/lib`.
2. Adds a new option `nullsFirst` that places nulls at the start of the
array (to support the Spark function `sort_array`).
3. Extracts the common logic of `SimpleComparisonMatcher` into
`velox/functions/lib`, and creates a separate `SimpleComparisonChecker` for Spark
and for Presto to match comparisons (e.g., `=` is `eq` in Presto, but `equalto` in Spark).
4. Adds tests to cover the Spark rewrite logic.
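
As a rough illustration of what a `nullsFirst` option combined with a simple comparator means, here is a minimal sketch using standard containers. It is not the Velox implementation and does not claim to reproduce Spark's or Presto's exact semantics; `sortArray` and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

// Sorts a nullable integer array. Non-null elements are ordered by `less`;
// nulls are grouped at the start when nullsFirst is true and at the end
// otherwise.
std::vector<std::optional<int>> sortArray(
    std::vector<std::optional<int>> values,
    bool nullsFirst,
    const std::function<bool(int, int)>& less = std::less<int>()) {
  assert(less != nullptr);
  std::stable_sort(
      values.begin(),
      values.end(),
      [&](const std::optional<int>& a, const std::optional<int>& b) {
        if (!a.has_value() || !b.has_value()) {
          if (a.has_value() == b.has_value()) {
            return false; // Both are null: treat as equivalent.
          }
          // Exactly one is null: it goes first only when nullsFirst is set.
          return a.has_value() ? !nullsFirst : nullsFirst;
        }
        return less(*a, *b);
      });
  return values;
}
```

Keeping the null placement outside the user-supplied comparator means the comparator only ever sees non-null values.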

Pull Request resolved: facebookincubator#10138

Reviewed By: Yuhta

Differential Revision: D71047836

Pulled By: kagamiori

fbshipit-source-id: 80ff84670985f4000308f509841656f388702cc5
…ncubator#14676)

Summary: Pull Request resolved: facebookincubator#14676

Reviewed By: kagamiori

Differential Revision: D82030005

Pulled By: Yuhta

fbshipit-source-id: f656bf304e16856916d77e89b942647d8b0820db
Summary:
Fixes the below compilation error.

```
velox/experimental/cudf/connectors/parquet/ParquetDataSink.cpp:145:7: error: 'commitStrategyToString' was not declared in this scope; did you mean 'commitStrategy_'?
  145 |       commitStrategyToString(commitStrategy_));
      |       ^~~~~~~~~~~~~~~~~~~~~~
```

Follow-up for facebookincubator@99fe06a.

Pull Request resolved: facebookincubator#14799

Reviewed By: kagamiori

Differential Revision: D82029874

Pulled By: Yuhta

fbshipit-source-id: ea537a07e8d2625ebc0b5fddea2dea8b0b098de8
…kincubator#14848)

Summary:
X-link: facebookincubator/axiom#394

Pull Request resolved: facebookincubator#14848

Continuation of facebookincubator#14784

bypass-github-export-checks

Reviewed By: Yuhta

Differential Revision: D82289830

fbshipit-source-id: b12e1042d412c0bc992655665b4363bb614398b9
…facebookincubator#14843)

Summary:
Fixes facebookincubator#14842

This failed a static assertion with libstdc++ 15. See also the error log in the associated issue.

Pull Request resolved: facebookincubator#14843

Reviewed By: xiaoxmeng

Differential Revision: D82329227

Pulled By: kagamiori

fbshipit-source-id: 6d85d572bc8564d0e4bc319b569c23851f595deb
…ebookincubator#14825)

Summary:
Pull Request resolved: facebookincubator#14825

The current implementation of AlignedBuffer::allocate does not allocate the exact size by default.
When the buffer size is known beforehand, this can lead to significant memory
overconsumption because the buffer allocates the size suggested by MemoryPool::getPreferredSize.

To avoid that overallocation, an allocateExact parameter was recently added to AlignedBuffer::allocate.
However, using this parameter is a bit clunky. To make such calls more explicit, this change introduces a helper
function, AlignedBuffer::allocateExact, which is simply a thin wrapper around AlignedBuffer::allocate.

For reference, here are the current ranges produced by MemoryPool::getPreferredSize:
```
               1 -            8 =            8
               9 -           12 =           12
              13 -           16 =           16
              17 -           24 =           24
              25 -           32 =           32
              33 -           48 =           48
              49 -           64 =           64
              65 -           96 =           96
              97 -          128 =          128
             129 -          192 =          192
             193 -          256 =          256
             257 -          384 =          384
             385 -          512 =          512
             513 -          768 =          768
             769 -        1,024 =        1,024
           1,025 -        1,536 =        1,536
           1,537 -        2,048 =        2,048
           2,049 -        3,072 =        3,072
           3,073 -        4,096 =        4,096
           4,097 -        6,144 =        6,144
           6,145 -        8,192 =        8,192
           8,193 -       12,288 =       12,288
          12,289 -       16,384 =       16,384
          16,385 -       24,576 =       24,576
          24,577 -       32,768 =       32,768
          32,769 -       49,152 =       49,152
          49,153 -       65,536 =       65,536
          65,537 -       98,304 =       98,304
          98,305 -      131,072 =      131,072
         131,073 -      196,608 =      196,608
         196,609 -      262,144 =      262,144
         262,145 -      393,216 =      393,216
         393,217 -      524,288 =      524,288
         524,289 -      786,432 =      786,432
         786,433 -    1,048,576 =    1,048,576
       1,048,577 -    1,572,864 =    1,572,864
       1,572,865 -    2,097,152 =    2,097,152
       2,097,153 -    3,145,728 =    3,145,728
       3,145,729 -    4,194,304 =    4,194,304
       4,194,305 -    6,291,456 =    6,291,456
       6,291,457 -    8,388,608 =    8,388,608
       8,388,609 -   12,582,912 =   12,582,912
      12,582,913 -   16,777,216 =   16,777,216
      16,777,217 -   25,165,824 =   25,165,824
      25,165,825 -   33,554,432 =   33,554,432
      33,554,433 -   50,331,648 =   50,331,648
      50,331,649 -   67,108,864 =   67,108,864
      67,108,865 -  100,663,296 =  100,663,296
     100,663,297 -  134,217,728 =  134,217,728
     134,217,729 -  201,326,592 =  201,326,592
     201,326,593 -  268,435,456 =  268,435,456
     268,435,457 -  402,653,184 =  402,653,184
     402,653,185 -  536,870,912 =  536,870,912
     536,870,913 -  805,306,368 =  805,306,368
     805,306,369 - 1,073,741,824 = 1,073,741,824
   1,073,741,825 - 1,610,612,736 = 1,610,612,736
   1,610,612,737 - 2,147,483,648 = 2,147,483,648
   2,147,483,649 - 3,221,225,472 = 3,221,225,472
   3,221,225,473 - 4,294,967,296 = 4,294,967,296
```
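
The ranges above follow a regular pattern: each size is rounded up to the next value of the form 2^k or 1.5 * 2^k, with a minimum of 8. The sketch below reproduces that pattern as inferred from the table only; it is not the MemoryPool::getPreferredSize source, and `preferredSizeSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `size` up to the next value of the form 2^k or 1.5 * 2^k (minimum 8),
// matching the ranges listed in the table. Only covers the table's domain.
uint64_t preferredSizeSketch(uint64_t size) {
  assert(size > 0 && size <= (uint64_t{1} << 32));
  if (size <= 8) {
    return 8;
  }
  // Find the power of two `lower` such that lower < size <= 2 * lower.
  uint64_t lower = 8;
  while (lower * 2 < size) {
    lower *= 2;
  }
  const uint64_t middle = lower + lower / 2; // 1.5 * lower
  return size <= middle ? middle : lower * 2;
}
```

Under this pattern a request just past a boundary, such as 1,025 bytes, is rounded up to 1,536 bytes, which is the kind of overallocation an exact-size allocation avoids.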

Reviewed By: Yuhta

Differential Revision: D82167134

fbshipit-source-id: 4733be75c39ca90a1fead0faa0c44ec12b80bae9
… join conditions (facebookincubator#14837)

Summary:
Pull Request resolved: facebookincubator#14837

Add filter support to index join for filters that cannot be converted into join conditions and pushed down to the index source. The filter is executed on the lookup result before left join processing. This enables Meta AI data exploration query shapes.

The follow-up is to consider using the join match tracker inside index lookup to handle this logic, for consistency with other join type implementations.
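
The ordering described above (filter the lookup result first, then apply left join semantics) can be sketched as follows. This is a simplified model with plain vectors, not the Velox index join operator; `LookupMatch`, `JoinedRow`, and `leftJoinWithFilter` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// One row returned by the index lookup: which probe row it matched and the
// looked-up value. Both fields are hypothetical.
struct LookupMatch {
  std::size_t probeRow;
  int value;
};

// Output row of the left join: the probe row plus the matched value, or
// std::nullopt when no match survived the filter.
using JoinedRow = std::pair<std::size_t, std::optional<int>>;

std::vector<JoinedRow> leftJoinWithFilter(
    std::size_t numProbeRows,
    const std::vector<LookupMatch>& lookupResult,
    const std::function<bool(const LookupMatch&)>& filter) {
  // Step 1: apply the filter to the raw lookup result.
  std::vector<std::vector<int>> surviving(numProbeRows);
  for (const auto& match : lookupResult) {
    assert(match.probeRow < numProbeRows);
    if (filter(match)) {
      surviving[match.probeRow].push_back(match.value);
    }
  }
  // Step 2: left join semantics on the filtered matches. A probe row whose
  // matches were all filtered out is still emitted, with a null value.
  std::vector<JoinedRow> output;
  for (std::size_t row = 0; row < numProbeRows; ++row) {
    if (surviving[row].empty()) {
      output.emplace_back(row, std::nullopt);
    } else {
      for (int value : surviving[row]) {
        output.emplace_back(row, value);
      }
    }
  }
  return output;
}
```

In this model, a probe row with lookup hits that all fail the filter is treated the same as a probe row with no hits at all.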

Reviewed By: zacw7

Differential Revision: D82149399

fbshipit-source-id: efa942e1d8c58f08dbe51d955f91858367bdfe6c
…incubator#14658)

Summary:
Expression rewrites are currently defined in `VectorFunction.cpp`, rewrites are registered only for scalar functions, and rewrites are applied during expression compilation. In the upcoming `ExpressionOptimizer` work (facebookincubator#14523), we intend to build on the existing `ExpressionRewrite` support and introduce more expression rewrites for special form expressions.

This refactor formalizes the existing `ExpressionRewrite` framework so it can be expanded upon in the `ExpressionOptimizer`.

Pull Request resolved: facebookincubator#14658

Reviewed By: mbasmanova

Differential Revision: D82252841

Pulled By: kagamiori

fbshipit-source-id: 0f9007c3ae9e4dbdd58cefe7397874998841c295
Summary:
Pull Request resolved: facebookincubator#14836

This change adds support for the TIME type in Velox. Support for casting TIME, support in the simple function interface, and basic UDFs supporting TIME will come in subsequent diffs.

See Issue: facebookincubator#14633

Reviewed By: kevinwilfong

Differential Revision: D81811610

fbshipit-source-id: ceb6c324c7ce4dc5bf761e77647283fbe72a4104
Summary:
Register cuDF in the executor and allow it to be disabled at the task plan level. For some plans, if the plan cannot be fully executed on the GPU, the stage is not offloaded to the GPU in order to avoid the format conversion cost, so a config is needed to disable the cuDF driver adapter.

Pull Request resolved: facebookincubator#14216

Reviewed By: Yuhta

Differential Revision: D82322695

Pulled By: kagamiori

fbshipit-source-id: 110e209a38a072ec6170e80fa26b804205ce7c42
Summary:
With the move to C++20, using volatile is an error.
One benchmark used it, but this did not surface in CI, which did not appear to build the benchmarks at all. As a result, CI also needed a fix to ensure the benchmarks are compiled.
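
For context, C++20 deprecates several uses of volatile, including compound assignment and increment on volatile operands, and builds that treat warnings as errors reject them. The sketch below shows one way to keep a volatile sink without the deprecated forms. It assumes the benchmark used a pattern of this kind, which the summary does not state, and `sumIntoSink` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Accumulates into a volatile sink so the compiler cannot drop the loop.
// `sink += i` is deprecated for volatile operands in C++20; spelling out the
// read and the write as separate statements avoids the deprecated form.
uint64_t sumIntoSink(uint64_t n) {
  // Keeps n * (n + 1) / 2 within 64 bits.
  assert(n < (uint64_t{1} << 32));
  volatile uint64_t sink = 0;
  for (uint64_t i = 1; i <= n; ++i) {
    const uint64_t current = sink; // Explicit volatile read.
    sink = current + i;            // Explicit volatile write.
  }
  return sink;
}
```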

Pull Request resolved: facebookincubator#14552

Reviewed By: bikramSingh91

Differential Revision: D82456701

Pulled By: kgpai

fbshipit-source-id: e3c68d3c8be2aaf200390cba993a3ebadb4d3408
…ncubator#14863)

Summary:
Pull Request resolved: facebookincubator#14863

These constants will be used in spatial joins, which won't actually
need the full power of GEOS.  Extracting these means we can keep the join code
simpler and agnostic to the join filter, and avoid conditional compilation flags
for GEOS.

In the future we can extract more constants to GeometryConstants if desired.

Reviewed By: bikramSingh91

Differential Revision: D82451199

fbshipit-source-id: 907a19f9d97573a663ab6a07b2a6c9963c83797c
…cebookincubator#14767)

Summary:
Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.12.4 to 1.13.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/pypa/gh-action-pypi-publish/releases">pypa/gh-action-pypi-publish's releases</a>.</em></p>
<blockquote>
<h2>v1.13.0</h2>

<blockquote>
<p>[!important]
🚨 This release includes fixes for <a href="https://github.com/pypa/gh-action-pypi-publish/security/advisories/GHSA-vxmw-7h4f-hqxh">GHSA-vxmw-7h4f-hqxh</a> discovered by <a href="https://github.com/woodruffw"><code>@​woodruffw</code></a><a href="https://github.com/sponsors/woodruffw">💰</a>.
We've also integrated <a href="http://zizmor.sh">Zizmor</a> to catch similar issues in the future and you should too.</p>
</blockquote>
<h2>✨ New Stuff</h2>
<p><a href="https://github.com/woodruffw"><code>@​woodruffw</code></a><a href="https://github.com/sponsors/woodruffw">💰</a> updated the README to no longer mention the attestations feature being experimental in <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/347">https://github.com/facebookincubator/velox/issues/347</a>: it's been rather stable for a year already 🎉
He also added more diagnostic output which includes printing out the GitHub Environment claim via <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/371">https://github.com/facebookincubator/velox/issues/371</a> and warning about the unsupported reusable workflows configurations <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/306">https://github.com/facebookincubator/velox/issues/306</a>, when using Trusted Publishing.</p>
<blockquote>
<p>[!tip]
The official support for reusable workflows is currently blocked on changes to PyPI. To get updates about progress on the action side, you may want to subscribe to <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/166">https://github.com/facebookincubator/velox/issues/166</a>.
At PyCon US 2025 Sprints, <a href="https://github.com/facutuesca"><code>@​facutuesca</code></a><a href="https://github.com/sponsors/facutuesca">💰</a>, <a href="https://github.com/miketheman"><code>@​miketheman</code></a><a href="https://github.com/sponsors/miketheman">💰</a>, <a href="https://github.com/woodruffw"><code>@​woodruffw</code></a><a href="https://github.com/sponsors/woodruffw">💰</a> and I<a href="https://github.com/sponsors/webknjaz">💰</a> spent several hours IRL brainstorming how to fix this and migrate projects that happen to rely on an obscure corner case with reusable workflows that temporarily allows them to function by accident.
The result of that discussion is posted @ <a href="https://redirect.github.com/pypi/warehouse/issues/11096#issuecomment-2895081700">pypi/warehouse#11096</a>.
Note that this is a volunteer-led effort and there is no ETA. If you need this soon, make your employer sponsor the PSF and maybe they'll be able to hire somebody for this work on Warehouse.</p>
</blockquote>
<p>In addition to that, <a href="https://github.com/konstin"><code>@​konstin</code></a><a href="https://github.com/sponsors/konstin">💰</a> sent <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/378">https://github.com/facebookincubator/velox/issues/378</a> to pin <code>actions/setup-python</code> to a SHA hash. This makes <code>pypi-publish</code> compatible with new GitHub policies that allow organizations to mandate hash-pinning actions used in workflows.</p>
<h2>🛠️ Internal Dependencies</h2>
<p><a href="https://github.com/webknjaz"><code>@​webknjaz</code></a><a href="https://github.com/sponsors/webknjaz">💰</a> made a bunch of updates to the action runtime which includes bumping it to Python 3.13 in <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/331">https://github.com/facebookincubator/velox/issues/331</a> and updating the dependency tree across the board. <code>pip-with-requires-python</code> is no longer being installed (<a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/332">https://github.com/facebookincubator/velox/issues/332</a>). Some related bumps were contributed by <a href="https://github.com/woodruffw"><code>@​woodruffw</code></a><a href="https://github.com/sponsors/woodruffw">💰</a> (<a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/359">https://github.com/facebookincubator/velox/issues/359</a>) and <a href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a><a href="https://github.com/sponsors/kurtmckee">💰</a> sent a contributor-facing PR, bumping the linting configuration via <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/335">https://github.com/facebookincubator/velox/issues/335</a>.</p>
<h2>💪 New Contributors</h2>
<ul>
<li><a href="https://github.com/kurtmckee"><code>@​kurtmckee</code></a> made their first contribution in <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/335">https://github.com/facebookincubator/velox/issues/335</a></li>
<li><a href="https://github.com/konstin"><code>@​konstin</code></a> made their first contribution in <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/378">https://github.com/facebookincubator/velox/issues/378</a></li>
</ul>
<p><strong>🪞 Full Diff</strong>: <a href="https://github.com/pypa/gh-action-pypi-publish/compare/v1.12.4...v1.13.0">https://github.com/pypa/gh-action-pypi-publish/compare/v1.12.4...v1.13.0</a></p>
<p><strong>🧔‍♂️ Release Manager:</strong> <a href="https://github.com/sponsors/webknjaz"><code>@​webknjaz</code></a> <a href="https://stand-with-ukraine.pp.ua">🇺🇦</a></p>
<p><strong>💬 Discuss</strong> <a href="https://bsky.app/profile/webknjaz.me/post/3lxxzvzhvfc2e">on Bluesky 🦋</a>, <a href="https://mastodon.social/webknjaz/115143522527224444">on Mastodon 🐘</a> and <a href="https://github.com/pypa/gh-action-pypi-publish/discussions/379">on GitHub</a>.</p>
<p><a href="https://github.com/sponsors/webknjaz"><img src="https://img.shields.io/badge/%40webknjaz-transparent?logo=githubsponsors&amp;logoColor=%23EA4AAA&amp;label=Sponsor&amp;color=2a313c" alt="GH Sponsors badge" /></a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e"><code>ed0c539</code></a> 📦📌 Bump the pinned dependency tree</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/77db1b7cf7dcea2e403bb4350516284282740dd6"><code>77db1b7</code></a> Merge branch PR <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/306">https://github.com/facebookincubator/velox/issues/306</a>, GHSA-vxmw-7h4f-hqxh fix and PR <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/378">https://github.com/facebookincubator/velox/issues/378</a> into unstable/v1</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/280b3a1b7e38a360b85b4ee41645d27b79bde3fc"><code>280b3a1</code></a> Alias <code>typing as t</code> in imports</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/e380240d7e3673f460e0621686f33fbbf9594e85"><code>e380240</code></a> Use <code>object</code> in place of <code>typing.Any</code> in annotations</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/e50bff6eb477e46de0cbacc0693737ecb690eb0f"><code>e50bff6</code></a> Deduplicate claim ref lookup</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/decbc9a5d448364aa64c211724dc79a2cefcab2a"><code>decbc9a</code></a> Hint people to subscribe to <a href="https://redirect.github.com/pypa/gh-action-pypi-publish/issues/166">https://github.com/facebookincubator/velox/issues/166</a> for notifications</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/8208ad36a18e6fdd644f6ad69dc70c833d8af633"><code>8208ad3</code></a> Ask not to report bugs with reusable workflow</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/ff0fef5bdb66aa250f741d5d8b00a8b78b9dffd5"><code>ff0fef5</code></a> 🧪 Scope WPS202 suppression to specific files</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/1293b8c325b5f9abcab5160ee3553de2ee6a883f"><code>1293b8c</code></a> Use yamllint disable line length lint</li>
<li><a href="https://github.com/pypa/gh-action-pypi-publish/commit/ed01280d14b6f9a0edaa1a5494d8f7ffed709083"><code>ed01280</code></a> Linter (different rule)</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/gh-action-pypi-publish/compare/76f52bc884231f62b9a034ebfe128415bbaabdfc...ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e">compare view</a></li>
</ul>
</details>
<br />

Pull Request resolved: facebookincubator#14767

Reviewed By: kKPulla

Differential Revision: D82466593

Pulled By: kagamiori

fbshipit-source-id: 42baeea5b5d19bd6384d688e1ba9e9e2050ba3b4
@karthikeyann karthikeyann requested a review from a team as a code owner September 15, 2025 22:05
copy-pr-bot bot commented Sep 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

czentgr and others added 5 commits September 15, 2025 15:12
Summary:
The sed program usage only works on Linux and needs a fix for macOS. This is the same fix that was applied for libstemmer.

Pull Request resolved: facebookincubator#14852

Reviewed By: kKPulla

Differential Revision: D82466401

Pulled By: kagamiori

fbshipit-source-id: 12d66f3a5b48b4df5b59d879c9022c88cdb5660f
Summary:
Pull Request resolved: facebookincubator#14854

Cache row size estimates so that callers can call it multiple times without worrying too much about the cost. It is a prereq diff for having a more dynamic row size estimate in case of missing file stats.
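
A minimal sketch of this kind of caching, assuming the estimate is produced by some expensive callable, is shown below. `CachedRowSizeEstimate` and its methods are hypothetical names rather than the interface added in this diff, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical wrapper that computes an expensive estimate at most once and
// serves later calls from the cached value.
class CachedRowSizeEstimate {
 public:
  explicit CachedRowSizeEstimate(std::function<uint64_t()> compute)
      : compute_(std::move(compute)) {
    assert(compute_ != nullptr);
  }

  // Returns the cached value, computing it on first use.
  uint64_t get() {
    if (!cached_.has_value()) {
      cached_ = compute_();
    }
    return *cached_;
  }

  // Drops the cached value, e.g. when the underlying inputs change.
  void invalidate() {
    cached_.reset();
  }

 private:
  std::function<uint64_t()> compute_;
  std::optional<uint64_t> cached_;
};
```

An explicit `invalidate()` is one possible hook for a more dynamic estimate later on; whether the real change exposes anything similar is not stated in the summary.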

Reviewed By: tanjialiang

Differential Revision: D81762324

fbshipit-source-id: a63d9c8b18634c1f7b89b0bd7f70f6a53d3722b7
…ebookincubator#14855)

Summary:
Pull Request resolved: facebookincubator#14855

X-link: facebookincubator/nimble#250

Original diff: D80310282

Add a framework to complement the row size estimate heuristics, based on the retained vector sizes.

Currently this framework is used as a stopgap solution to still have functional row estimates when column stats are missing and decoders cannot provide a relatively cheap estimate. The current functionality gap in decoder row estimates caused various queries to run with very small batches (frequently just 10 rows), vastly slowing down the downstream eval.

NOTE: this diff fixes an accounting issue for arrays and maps, which was causing query OOMs.

Reviewed By: tanjialiang

Differential Revision: D81762328

fbshipit-source-id: 4238e5e45632323b577fc1bc58b6ebfcf9033dfc
…r#14857)

Summary:
Pull Request resolved: facebookincubator#14857

X-link: facebookincubator/nimble#251

Add a kill switch for row size tracking in case it has unexpected overhead for some data shapes. (Low concern IMO because the row size tracking would quickly increase the batch size and reduce its own overhead. If the end state batch size is still small, we should either way tune the batch memory budget.)

The session property wire up would be added in a presto PR separately.

NOTE: this diff also disables row size tracking for metalake reads by default.

Reviewed By: tanjialiang

Differential Revision: D81762323

fbshipit-source-id: 56b62f4d4d70f779576177f9d40d72db419c7809
…cubator#14768)

Summary:
Bumps [actions/github-script](https://github.com/actions/github-script) from 7.0.1 to 8.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/actions/github-script/releases">actions/github-script's releases</a>.</em></p>
<blockquote>
<h2>v8.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update Node.js version support to 24.x by <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/637">actions/github-script#637</a></li>
<li>README for updating actions/github-script from v7 to v8 by <a href="https://github.com/sneha-krip"><code>@​sneha-krip</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/653">actions/github-script#653</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a href="https://github.com/actions/runner/releases/tag/v2.327.1">Release Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this release.</p>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/637">actions/github-script#637</a></li>
<li><a href="https://github.com/sneha-krip"><code>@​sneha-krip</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/653">actions/github-script#653</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/github-script/compare/v7.1.0...v8.0.0">https://github.com/actions/github-script/compare/v7.1.0...v8.0.0</a></p>
<h2>v7.1.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Upgrade husky to v9 by <a href="https://github.com/benelan"><code>@​benelan</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/482">actions/github-script#482</a></li>
<li>Add workflow file for publishing releases to immutable action package by <a href="https://github.com/Jcambass"><code>@​Jcambass</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/485">actions/github-script#485</a></li>
<li>Upgrade IA Publish by <a href="https://github.com/Jcambass"><code>@​Jcambass</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/486">actions/github-script#486</a></li>
<li>Fix workflow status badges by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/497">actions/github-script#497</a></li>
<li>Update usage of <code>actions/upload-artifact</code> by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/512">actions/github-script#512</a></li>
<li>Clear up package name confusion by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/514">actions/github-script#514</a></li>
<li>Update dependencies with <code>npm audit fix</code> by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/515">actions/github-script#515</a></li>
<li>Specify that the used script is JavaScript by <a href="https://github.com/timotk"><code>@​timotk</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/478">actions/github-script#478</a></li>
<li>chore: Add Dependabot for NPM and Actions by <a href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/472">actions/github-script#472</a></li>
<li>Define <code>permissions</code> in workflows and update actions by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/531">actions/github-script#531</a></li>
<li>chore: Add Dependabot for .github/actions/install-dependencies by <a href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/532">actions/github-script#532</a></li>
<li>chore: Remove .vscode settings by <a href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/533">actions/github-script#533</a></li>
<li>ci: Use github/setup-licensed by <a href="https://github.com/nschonni"><code>@​nschonni</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/473">actions/github-script#473</a></li>
<li>make octokit instance available as octokit on top of github, to make it easier to seamlessly copy examples from GitHub rest api or octokit documentations by <a href="https://github.com/iamstarkov"><code>@​iamstarkov</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/508">actions/github-script#508</a></li>
<li>Remove <code>octokit</code> README updates for v7 by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/557">actions/github-script#557</a></li>
<li>docs: add &quot;exec&quot; usage examples by <a href="https://github.com/neilime"><code>@​neilime</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/546">actions/github-script#546</a></li>
<li>Bump ruby/setup-ruby from 1.213.0 to 1.222.0 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot] in <a href="https://redirect.github.com/actions/github-script/pull/563">actions/github-script#563</a></li>
<li>Bump ruby/setup-ruby from 1.222.0 to 1.229.0 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot] in <a href="https://redirect.github.com/actions/github-script/pull/575">actions/github-script#575</a></li>
<li>Clearly document passing inputs to the <code>script</code> by <a href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/603">actions/github-script#603</a></li>
<li>Update README.md by <a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a href="https://redirect.github.com/actions/github-script/pull/610">actions/github-script#610</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/benelan"><code>@​benelan</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/482">actions/github-script#482</a></li>
<li><a href="https://github.com/Jcambass"><code>@​Jcambass</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/485">actions/github-script#485</a></li>
<li><a href="https://github.com/timotk"><code>@​timotk</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/478">actions/github-script#478</a></li>
<li><a href="https://github.com/iamstarkov"><code>@​iamstarkov</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/508">actions/github-script#508</a></li>
<li><a href="https://github.com/neilime"><code>@​neilime</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/546">actions/github-script#546</a></li>
<li><a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> made their first contribution in <a href="https://redirect.github.com/actions/github-script/pull/610">actions/github-script#610</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/github-script/compare/v7...v7.1.0">https://github.com/actions/github-script/compare/v7...v7.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/actions/github-script/commit/ed597411d8f924073f98dfc5c65a23a2325f34cd"><code>ed59741</code></a> Merge pull request <a href="https://redirect.github.com/actions/github-script/issues/653">#653</a> from actions/sneha-krip/readme-for-v8</li>
<li><a href="https://github.com/actions/github-script/commit/2dc352e4baefd91bec0d06f6ae2f1045d1687ca3"><code>2dc352e</code></a> Bold minimum Actions Runner version in README</li>
<li><a href="https://github.com/actions/github-script/commit/01e118c8d0d22115597e46514b5794e7bc3d56f1"><code>01e118c</code></a> Update README for Node 24 runtime requirements</li>
<li><a href="https://github.com/actions/github-script/commit/8b222ac82eda86dcad7795c9d49b839f7bf5b18b"><code>8b222ac</code></a> Apply suggestion from <a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a></li>
<li><a href="https://github.com/actions/github-script/commit/adc0eeac992408a7b276994ca87edde1c8ce4d25"><code>adc0eea</code></a> README for updating actions/github-script from v7 to v8</li>
<li><a href="https://github.com/actions/github-script/commit/20fe497b3fe0c7be8aae5c9df711ac716dc9c425"><code>20fe497</code></a> Merge pull request <a href="https://redirect.github.com/actions/github-script/issues/637">#637</a> from actions/node24</li>
<li><a href="https://github.com/actions/github-script/commit/e7b7f222b11a03e8b695c4c7afba89a02ea20164"><code>e7b7f22</code></a> update licenses</li>
<li><a href="https://github.com/actions/github-script/commit/2c81ba05f308415d095291e6eeffe983d822345b"><code>2c81ba0</code></a> Update Node.js version support to 24.x</li>
<li><a href="https://github.com/actions/github-script/commit/f28e40c7f34bde8b3046d885e986cb6290c5673b"><code>f28e40c</code></a> Merge pull request <a href="https://redirect.github.com/actions/github-script/issues/610">#610</a> from actions/nebuk89-patch-1</li>
<li><a href="https://github.com/actions/github-script/commit/1ae9958572fde544457e4d51aed5ea044e8936f3"><code>1ae9958</code></a> Update README.md</li>
<li>Additional commits viewable in <a href="https://github.com/actions/github-script/compare/60a0d83039c74a4aee543508d2ffcb1c3799cdea...ed597411d8f924073f98dfc5c65a23a2325f34cd">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/github-script&package-manager=github_actions&previous-version=7.0.1&new-version=8.0.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

 ---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `dependabot rebase` will rebase this PR
- `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `dependabot merge` will merge this PR after your CI passes on it
- `dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `dependabot cancel merge` will cancel a previously requested merge and block automerging
- `dependabot reopen` will reopen this PR if it is closed
- `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Pull Request resolved: facebookincubator#14768

Reviewed By: kKPulla

Differential Revision: D82466332

Pulled By: kagamiori

fbshipit-source-id: 64f7b59c19ad54e8e1e4c3ab2e21442d7a009b4e
peterenescu and others added 30 commits October 14, 2025 21:38
Summary:
Pull Request resolved: facebookincubator#15038

Adds FlatMapVector support to the subscript and element_at functions. This allows FlatMapVector-encoded maps to be read without conversion and their values to be extracted directly.

Additionally, this support will greatly increase performance, since the values for a single key can be projected without materializing the full map.

Reviewed By: pedroerp

Differential Revision: D83867312

fbshipit-source-id: 5b31fe4c484516e2157487573f542177fe9a81b6
Summary:
Pull Request resolved: facebookincubator#15069

## Summary

Implemented REMAP_KEYS UDF for Velox following the task requirements in T240490714.

## Function Signature
```
REMAP_KEYS(MAP(K,V), ARRAY[K], ARRAY[K]) -> MAP(K, V)
```

## Key Features

- **Key remapping**: Changes map keys based on old-to-new key mapping arrays
- **Value preservation**: All values remain unchanged, only keys are remapped
- **Null handling**:
  - Null values in maps are preserved
  - Null keys in arrays are ignored
- **Mismatched array lengths**: Uses minimum of both array lengths for mapping
- **Unmapped keys**: Keys not in oldKeys array remain unchanged
- **Optimized implementations**: Three specialized implementations for different types

## Implementation Details

### Core Files Added/Modified:
- `fbcode/velox/functions/prestosql/RemapKeys.h` - Main implementation with 3 optimized variants
- `fbcode/velox/functions/prestosql/registration/MapFunctionsRegistration.cpp` - Function registration
- `fbcode/velox/functions/prestosql/BUCK` - Build configuration
- `fbcode/velox/functions/prestosql/tests/RemapKeysTest.cpp` - Comprehensive test suite
- `fbcode/velox/functions/prestosql/tests/CMakeLists.txt` - Test build configuration
- `fbcode/velox/expression/fuzzer/ExpressionFuzzerTest.cpp` - Fuzzer exclusion

### Implementation Variants:
1. **RemapKeysPrimitiveFunction**: Optimized for primitive types with hash map lookup
2. **RemapKeysVarcharFunction**: String-optimized with zero-copy semantics
3. **RemapKeysFunction**: Generic implementation for complex types

### Behavior Examples:
```sql
SELECT remap_keys(MAP(ARRAY[1, 2, 3], ARRAY[10, 20, 30]), ARRAY[1, 3], ARRAY[100, 300]);
-- MAP(ARRAY[100, 2, 300], ARRAY[10, 20, 30])

SELECT remap_keys(MAP(ARRAY['a', 'b'], ARRAY[1, 2]), ARRAY['a'], ARRAY['alpha']);
-- MAP(ARRAY['alpha', 'b'], ARRAY[1, 2])

SELECT remap_keys(MAP(ARRAY[1, 2], ARRAY[10, null]), ARRAY[1], ARRAY[100]);
-- MAP(ARRAY[100, 2], ARRAY[10, null])
```
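
The semantics listed under Key Features can be sketched in plain C++. This is a minimal illustration, not the Velox implementation: `remapKeysSketch` and `Entry` are hypothetical names, maps are modeled as vectors of key/value pairs, and skipping a pair when either array element is null is an assumption. It does not model what happens when a remapped key collides with an existing key, since the description above does not specify that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// A map entry: non-null key, nullable value (hypothetical model).
using Entry = std::pair<int64_t, std::optional<int64_t>>;

std::vector<Entry> remapKeysSketch(
    const std::vector<Entry>& input,
    const std::vector<std::optional<int64_t>>& oldKeys,
    const std::vector<std::optional<int64_t>>& newKeys) {
  // Mismatched lengths: only the first min(size, size) pairs are used.
  const std::size_t n = std::min(oldKeys.size(), newKeys.size());
  std::unordered_map<int64_t, int64_t> mapping;
  for (std::size_t i = 0; i < n; ++i) {
    // Assumption: a pair with a null on either side is ignored.
    if (!oldKeys[i].has_value() || !newKeys[i].has_value()) {
      continue;
    }
    // Duplicate old keys: the last occurrence wins.
    mapping[*oldKeys[i]] = *newKeys[i];
  }

  std::vector<Entry> result;
  result.reserve(input.size());
  for (const auto& [key, value] : input) {
    const auto it = mapping.find(key);
    // Unmapped keys stay unchanged; values (including nulls) are preserved.
    result.emplace_back(it == mapping.end() ? key : it->second, value);
  }
  return result;
}
```

Calling it with the entries `{1: 10, 2: 20, 3: 30}`, old keys `[1, 3]`, and new keys `[100, 300]` mirrors the first SQL example above.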

## Testing

Comprehensive test coverage including:
- Basic functionality with various data types (int, float, bool, string, timestamp, complex)
- Edge cases (empty maps, empty arrays, no matching keys)
- Null handling (nulls in values, nulls in key arrays)
- Mismatched array lengths
- Duplicate old keys (last occurrence wins)
- Partial and complete key remapping

## Compatibility

- Follows existing Velox UDF patterns (similar to MAP_SUBSET and ARRAY_SUBSET)
- Maintains backward compatibility with existing map functions
- Velox-only function (excluded from Presto fuzzer tests)

[Session trajectory link](https://www.internalfb.com/intern/devai/devmate/inspector/?id=T240490714-e771fd55-1a8a-411e-acab-2cd2313a8296)

Reviewed By: zacw7

Differential Revision: D83999440

fbshipit-source-id: c52f6a7d6b95a95a368bbe3233b5ac11be3407ae
…ncubator#15087)

Summary:
Remove velox_cudf_hive_config library as it is not required.
cudf::cudf has to be a public dependency of velox_cudf_hive_connector because of the build error below.
Remove other unrelated dependencies.
Add a log message for the constructor.
```
/deepak/presto/presto-native-execution/velox/velox/experimental/cudf/connectors/hive/CudfHiveConfig.h:21:10: fatal error: cudf/types.hpp: No such file or directory
```

Pull Request resolved: facebookincubator#15087

Reviewed By: xiaoxmeng

Differential Revision: D84640709

Pulled By: kKPulla

fbshipit-source-id: 03d7b4ab46f0ce74448db79e2c6b9a7643462f05
…bookincubator#15182)

Summary:
Pull Request resolved: facebookincubator#15182

`-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it.

This:
```
try {
    ...
} catch (exception& e) {
    // no use of e
}
```
should instead be written as
```
} catch (exception&) {
```

If the code compiles, this is safe to land.

Differential Revision: D84732654

fbshipit-source-id: cb155bb70bb568d7d34d07134ed280ce10088178
Summary:
Implement lock-free updates.

For queries that need to skip many irrelevant stripes (e.g., 20251013_203818_00022_jsvmi), accessing the cache can
become a bottleneck: https://fburl.com/strobelight/vazgb08u

Reviewed By: Yuhta

Differential Revision: D84661649

fbshipit-source-id: a6ed6383f1255d0cd5f79248a652455b1a08d569
Summary:
Pull Request resolved: facebookincubator#15181

Persistent shuffle might need to extend the serialized page to include persistent-shuffle-specific data structures.

Reviewed By: tanjialiang

Differential Revision: D84527691

fbshipit-source-id: 07832685ea15b9250a655b275ebf382d55c8e3e2
…kincubator#15171)

Summary:
Pull Request resolved: facebookincubator#15171

These are clearer than `node->sources()[0]` or `[1]`.

Reviewed By: kgpai

Differential Revision: D84710426

fbshipit-source-id: 753ce26d991d53029e4dabb0700ae90d054c5b5d
Summary:
Pull Request resolved: facebookincubator#15174

Add cosco shuffle write trace and replay support.

Reviewed By: xiaoxmeng

Differential Revision: D84580118

fbshipit-source-id: 23ec3500d0a03fc55b203eb313049f74ad325799
…acebookincubator#15105)

Summary:
Pull Request resolved: facebookincubator#15105

X-link: facebookincubator/nimble#276

Refactored read operation parameters to use a `FileStorageContext` struct instead of separate `ioStats` and `fileReadOps` parameters. This addresses the problem where adding new parameters to read functions requires updating all implementations across the codebase.
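
The parameter-object pattern described here might look roughly like the sketch below. Only the names `FileStorageContext`, `ioStats`, and `fileReadOps` come from this description; the field types, `IoStatsSketch`, and `preadSketch` are assumptions made for illustration and do not reflect the actual Velox or Nimble signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the I/O statistics object.
struct IoStatsSketch {
  uint64_t bytesRead{0};
};

// Bundles per-read context. Adding a field here does not change the
// signature of any read function. Field types are assumptions.
struct FileStorageContext {
  IoStatsSketch* ioStats{nullptr};
  std::unordered_map<std::string, std::string> fileReadOps;
};

// Schematically, a read that used to take (…, ioStats, fileReadOps)
// now takes a single context argument.
std::string preadSketch(
    const std::string& fileContents,
    uint64_t offset,
    uint64_t length,
    const FileStorageContext& context) {
  std::string out = offset < fileContents.size()
      ? fileContents.substr(offset, length)
      : std::string();
  if (context.ioStats != nullptr) {
    context.ioStats->bytesRead += out.size();
  }
  return out;
}
```

With this shape, a future parameter becomes a new struct field with a default value, and existing call sites and overrides compile unchanged.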

Reviewed By: sdruzkin, Yuhta

Differential Revision: D84112628

fbshipit-source-id: 6eea1c6698850c67613d6e855daac6fb0e91b504
…15158)

Summary:
The 'cache_load_quantum' gflag is not used anywhere, so we should
remove its declaration.

Pull Request resolved: facebookincubator#15158

Reviewed By: Yuhta

Differential Revision: D84640818

Pulled By: kKPulla

fbshipit-source-id: 038e454edd26986f874b78d99ed72e4f5bb34d21
…cubator#14491)

Summary:
Fixes facebookincubator#14492, facebookincubator#14021.

Currently `BaseVector::flattenVector` doesn't unwrap lazy vectors. This patch makes it unwrap the lazy vectors.

It should also fix a number of vulnerabilities in the code base. For example, code like:

https://github.com/facebookincubator/velox/blob/42193a8015081187e06ed4e8ed77b2bb1002a236/velox/expression/FieldReference.cpp#L176-L179

could crash the program with a lazy input.

Some historical issues that relate to this topic:

facebookincubator#6168
facebookincubator#6170
facebookincubator#8697
facebookincubator#9282

Pull Request resolved: facebookincubator#14491

Reviewed By: kKPulla

Differential Revision: D84733842

Pulled By: pedroerp

fbshipit-source-id: a48cd6d9e2ba3ed96a0829b21b9c4c9a92767377
… construction (facebookincubator#15168)

Summary:
X-link: prestodb/presto#26094

`ExchangeQueue::promises_` is a `folly::F14FastMap<int, ContinuePromise>`.
Using the subscript operator:
```
  promises_[consumerId] = std::move(promise);
```
1. Default-constructs an empty `ContinuePromise` when inserting a new
   'consumerId' key.
2. Then immediately overwrites this temporary promise through the
   move-assignment.

For an empty `folly::Promise` object that is expected to be
overwritten, we should use `folly::makeEmpty()` to initialize it (it is
'invalid') instead of the default constructor (it will be 'valid' but
'not fulfilled', and assigning to it will cause an exception). Creating an
exception triggers a stack unwind, which can saturate the CPU in
high-concurrency scenarios, **causing significant performance issues**.

In high-concurrency scenarios, I can see a lot of CPU consumption here,
for the reason mentioned above.

This patch replaces the subscript-based insertion with:
```
  promises_.emplace(consumerId, std::move(promise));
```
which constructs the `ContinuePromise` in place and avoids creating and
overwriting a temporary empty promise. This completely eliminates the
redundant expensive `folly::Promise` stack backtrace, and thus, saves
a lot of CPU.
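
The difference between the two insertion styles can be observed with a small counting type. This sketch uses `std::unordered_map` and a hypothetical `TrackedPromise` as stand-ins for `folly::F14FastMap` and `ContinuePromise`; it only demonstrates which special member functions run, not folly's promise behavior.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>

// Hypothetical move-only stand-in that counts how it is constructed.
struct TrackedPromise {
  static inline int defaultConstructed = 0;
  static inline int moveAssigned = 0;
  int payload{0};

  TrackedPromise() { ++defaultConstructed; }
  explicit TrackedPromise(int p) : payload(p) {}
  TrackedPromise(TrackedPromise&& other) noexcept : payload(other.payload) {}
  TrackedPromise& operator=(TrackedPromise&& other) noexcept {
    ++moveAssigned;
    payload = other.payload;
    return *this;
  }
};

using PromiseMap = std::unordered_map<int, TrackedPromise>;

// operator[] default-constructs the mapped value for a new key, then the
// assignment overwrites it.
void insertWithSubscript(PromiseMap& promises, int id, TrackedPromise p) {
  promises[id] = std::move(p);
}

// emplace move-constructs the mapped value directly from the argument.
void insertWithEmplace(PromiseMap& promises, int id, TrackedPromise p) {
  promises.emplace(id, std::move(p));
}
```

One behavioral difference to keep in mind: `emplace` leaves an existing entry untouched when the key is already present, whereas the subscript form overwrites it.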

Pull Request resolved: facebookincubator#15168

Reviewed By: tanjialiang

Differential Revision: D84775483

Pulled By: pedroerp

fbshipit-source-id: cc2f26291a16f72745f4bf7116bebd144acb8841
…incubator#15186)

Summary:
Pull Request resolved: facebookincubator#15186

Adding default implementation for the virtual methods in
BaseStatsReporter. The purpose here is two-fold:
* Minimize boilerplate code for specializations, so they don't need to provide
  an empty body for each virtual method on the API
* Reduce the amount of symbols leaked to downstream dependencies of this API.

Also removing unnecessary string allocation in `statTypeString()`
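
As an illustration of the first goal, the sketch below shows a reporter base class whose virtual methods have empty default bodies, so a specialization overrides only what it needs. `StatsReporterSketch`, `SumReporter`, and the method names are hypothetical and are not the actual `BaseStatsReporter` API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical base class: every virtual method has a no-op default.
class StatsReporterSketch {
 public:
  virtual ~StatsReporterSketch() = default;

  virtual void addMetricValue(const std::string& /*key*/, std::size_t /*value*/) {}
  virtual void addHistogramValue(const std::string& /*key*/, std::size_t /*value*/) {}
};

// A specialization that cares about one method only. No empty override
// is needed for addHistogramValue.
class SumReporter final : public StatsReporterSketch {
 public:
  void addMetricValue(const std::string& /*key*/, std::size_t value) override {
    total_ += value;
  }

  std::size_t total() const {
    return total_;
  }

 private:
  std::size_t total_{0};
};
```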

Reviewed By: tanjialiang

Differential Revision: D84781701

fbshipit-source-id: 98d663029a7de9d18683a43164a5b3e367fc0546
Summary:
Pull Request resolved: facebookincubator#15058

- Added new `call` method handling `arg_type<Time>` input
- TIME values stored as milliseconds since midnight (0-86399999 range)
- Extracts seconds using: `(time / kMillisecondsInSecond) % 60`
- Includes input validation for valid TIME range
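
The extraction formula above can be checked with a standalone sketch. `secondFromTime` is a hypothetical free function, and throwing `std::out_of_range` is a stand-in for whatever validation the Velox function performs; only the formula and the value range come from this description.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

constexpr int64_t kMillisecondsInSecond = 1'000;
constexpr int64_t kMillisecondsInDay = 86'400'000;

// TIME is modeled as milliseconds since midnight, in [0, 86399999].
int64_t secondFromTime(int64_t time) {
  if (time < 0 || time >= kMillisecondsInDay) {
    throw std::out_of_range("TIME value out of range");
  }
  // Whole seconds since midnight, reduced to the second within the minute.
  return (time / kMillisecondsInSecond) % 60;
}
```

For example, 13:45:27.123 is 49527123 ms since midnight: dividing by 1000 gives 49527 whole seconds, and 49527 % 60 is 27.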

Reviewed By: duxiao1212

Differential Revision: D83989772

fbshipit-source-id: 3abae3cf2f2c4af6eb6adff4774b1402d0c10318
)

Summary:
Pull Request resolved: facebookincubator#15184

- added `isConstantEncoding()` check before allocation
- for constant input: return `ConstantVector<Timestamp>` with single converted value (O(1) memory)
- for flat input: fallback to element-wise copy (O(n) memory)
- handles both null and non-null constant cases
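
The encoding-aware path can be modeled outside Velox with a small variant type. Everything in this sketch is hypothetical (`ConstantColumn`, `FlatColumn`, `convertColumn`); it only illustrates why the constant case needs a single conversion and O(1) memory while the flat case needs one conversion per row.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <variant>
#include <vector>

// Hypothetical stand-ins for constant- and flat-encoded vectors.
template <typename T>
struct ConstantColumn {
  std::optional<T> value; // nullopt models a null constant
  std::size_t size;
};

template <typename T>
struct FlatColumn {
  std::vector<std::optional<T>> values;
};

template <typename T>
using Column = std::variant<ConstantColumn<T>, FlatColumn<T>>;

template <typename To, typename From, typename Fn>
Column<To> convertColumn(const Column<From>& input, Fn convert) {
  // Constant input: convert the single value once, keep the encoding.
  if (const auto* constant = std::get_if<ConstantColumn<From>>(&input)) {
    std::optional<To> converted;
    if (constant->value.has_value()) {
      converted = convert(*constant->value);
    }
    return ConstantColumn<To>{converted, constant->size};
  }

  // Flat input: fall back to element-wise conversion.
  const auto& flat = std::get<FlatColumn<From>>(input);
  FlatColumn<To> out;
  out.values.reserve(flat.values.size());
  for (const auto& v : flat.values) {
    out.values.push_back(
        v.has_value() ? std::optional<To>(convert(*v)) : std::nullopt);
  }
  return out;
}
```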

Reviewed By: duxiao1212

Differential Revision: D84774507

fbshipit-source-id: 80e668fd23ea445475c3b1e83a6caf9ab86ee67c
Summary:
Pull Request resolved: facebookincubator#15079

- added a new `call` method in `HourFunction` that accepts `arg_type<Time>` parameter
- TIME values are stored as milliseconds since midnight (0-86399999)
- extraction logic: `result = time / kMillisInHour` where `kMillisInHour = 3600000`
- includes input validation to ensure TIME values are within valid range [0, 86400000)
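
A minimal sketch of the hour extraction and the half-open range check follows. `hourFromTime` and `isValidTime` are hypothetical names, and the exception type is an assumption; the constant and the formula are taken from the bullets above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

constexpr int64_t kMillisInHour = 3'600'000;
constexpr int64_t kMillisInDay = 24 * kMillisInHour; // 86400000

// Valid TIME values lie in the half-open range [0, 86400000).
constexpr bool isValidTime(int64_t time) {
  return time >= 0 && time < kMillisInDay;
}

int64_t hourFromTime(int64_t time) {
  if (!isValidTime(time)) {
    throw std::out_of_range("TIME value out of range");
  }
  return time / kMillisInHour;
}
```

Because the upper bound is exclusive, the largest valid input (86399999) yields hour 23 and no modulo is needed.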

Reviewed By: kgpai

Differential Revision: D84084889

fbshipit-source-id: a0fa49e880bacb20d93685ee08c68d8ba6c04ddd
…acebookincubator#15185)

Summary:
Pull Request resolved: facebookincubator#15185

Removing backward compatibility code from remote function refactor,
now that we ensured all users of that code are updated.

Reviewed By: sebastianopeluso

Differential Revision: D84776510

fbshipit-source-id: c23a7f06d80edc378b8dd40203a68a488de0b7f2
…#15155)

Summary:
Pull Request resolved: facebookincubator#15155

Log queryId, schema, user, source, and table for open file requests. Code flow:
- In `SplitReader`, we send fileReadOps as part of `FileProperties` when generating `FileHandle`
- In `FileHandle`, we extract and put fileReadOps into `FileOptions` when calling openFileForRead in `WSFileSystem`
- In `WSFileSystem`, when we create `WSReadFile`, we pass fileReadOps as a parameter and populate `fileCreateOptions.commonOptions.requestOptions`, which are the user tags we send for open file requests.

There are still some requests missing tags, specifically for SpillReadFile path, stacktrace: P1992651580. Place where we would add FileOptions: https://www.internalfb.com/code/fbsource/[13af4aa6be187a240cc71edbcbc873aa03e2ba8c]/fbcode/velox/serializers/SerializedPageFile.cpp?lines=193, but this is more tricky as it is not the normal flow where we have access to query info.

Reviewed By: Yuhta

Differential Revision: D84553856

fbshipit-source-id: a24d1b63411ad7d34ac099aba2331fa7b2fd5181
…w partitions (facebookincubator#14585)

Summary:
Extend Window operator to read spilled data in batches of window partitions to improve performance in the presence of small partitions.
A new configuration setting window_spill_min_read_batch_rows with default value of 1'000 controls the minimum number of rows for a reading batch. Setting window_spill_min_read_batch_rows to 1 loads a single partition rather than a partition batch each time.

The preferred semantics would be to set a memory budget and load as many partitions as fit. This is not feasible at the moment because (1) estimating a single row's size is not efficient or accurate enough and might cause performance issues for variable-width data; (2) the spilled data format doesn't include information about how many rows are present in any given window partition.
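
The batching rule can be sketched as follows. This is a simplification under stated assumptions: `batchPartitions` is a hypothetical helper that is handed the row count of each partition up front, whereas the real operator has to discover partition boundaries while reading spilled rows. Each batch keeps taking whole partitions until it holds at least `minReadBatchRows` rows.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns, for each read batch, the indices of the partitions it contains.
std::vector<std::vector<std::size_t>> batchPartitions(
    const std::vector<std::size_t>& partitionRows,
    std::size_t minReadBatchRows) {
  std::vector<std::vector<std::size_t>> batches;
  std::vector<std::size_t> current;
  std::size_t rows = 0;
  for (std::size_t i = 0; i < partitionRows.size(); ++i) {
    // Partitions are never split across batches.
    current.push_back(i);
    rows += partitionRows[i];
    if (rows >= minReadBatchRows) {
      batches.push_back(std::move(current));
      current.clear();
      rows = 0;
    }
  }
  // Trailing partitions that did not reach the minimum form the last batch.
  if (!current.empty()) {
    batches.push_back(std::move(current));
  }
  return batches;
}
```

With a minimum of 1, every non-empty partition reaches the threshold on its own, which matches the single-partition behavior described above.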

Fixes facebookincubator#14469

Pull Request resolved: facebookincubator#14585

Reviewed By: kevinwilfong

Differential Revision: D84822583

Pulled By: pedroerp

fbshipit-source-id: f149c7c5cf32f21999fd3104d70884ee79ae0e84
…#15194)

Summary: Pull Request resolved: facebookincubator#15194

Reviewed By: tanjialiang

Differential Revision: D84853884

fbshipit-source-id: e706aa52c5f89c8aecc23295525524119c9b7986
…tor#15193)

Summary:
Pull Request resolved: facebookincubator#15193

Avoiding copying the vector of names on ROW type construction, if the
API client moves it in.
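
One common way to get this behavior is a by-value "sink" parameter that is then moved into the member. The sketch below uses a hypothetical `RowTypeSketch` class and is not the actual Velox `RowType` constructor.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class RowTypeSketch {
 public:
  // Callers that pass an rvalue pay for moves only; callers that pass an
  // lvalue pay for exactly one copy.
  explicit RowTypeSketch(std::vector<std::string> names)
      : names_(std::move(names)) {}

  const std::vector<std::string>& names() const {
    return names_;
  }

 private:
  std::vector<std::string> names_;
};
```

When the caller writes `RowTypeSketch type(std::move(names));`, the vector's heap buffer is handed over rather than duplicated, so no per-name string copies occur.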

Reviewed By: bikramSingh91

Differential Revision: D84852651

fbshipit-source-id: 61564f4ffe9d8d24e96b00ba80594b9d31e33db7
Summary:
Pull Request resolved: facebookincubator#14390

# LocalRunnerService Overview
---------------------------

**LocalRunnerService** is a Thrift service that enables remote execution of Velox query plans, primarily designed for fuzzing and regression-testing to identify behavior mismatches between new changes in diffs and prior builds.

**Module Interactions:**

*   **Thrift Layer** (`if/LocalRunnerService.thrift`):

    *   Defines an interface using `execute()` as a primary driver
    *   Provides comprehensive typing congruent with Velox data types (primitives, arrays, maps, rows)
    *   Handles request/response serialization with structured result batches
*   **Service Handler** (`LocalRunnerService.cpp`):

    *   Deserializes JSON-encoded query plans into Velox `PlanNode` objects
    *   Executes plans using `AssertQueryBuilder` (from Velox test utilities)
    *   Converts Velox vector results into Thrift format (defined above) through recursive type conversion
    *   Captures stdout and exceptions during execution for debugging and execution comparisons
    *   Returns structured results with column names, types, and data in columnar format
*   **Service Runner** (`LocalRunnerServiceRunner.cpp`):

    *   Bootstraps the Thrift server on a configurable port (default 9091)
    *   Initializes Velox subsystems (memory manager, serialization, function registrations)
    *   Runs as a standalone server process waiting for query execution requests

**Data Flow:** Client → Serialize Plan → Thrift Request → Deserialize Plan → Execute Query → Convert Results to Thrift → Response → Client

Reviewed By: kagamiori

Differential Revision: D79850066

fbshipit-source-id: a1b1904488a134140d1ec001c726a6420131ca7f
Summary:
Pull Request resolved: facebookincubator#14967

Add support for TIME type to the millisecond() function in Velox to enable extracting milliseconds from TIME values.

**Changes:**
- Added new `call` method in `MillisecondFunction` that handles TIME type
- TIME values are stored as milliseconds since midnight, so extraction uses modulo operation to get milliseconds within the current second
- Registered the new function signature `MillisecondFunction<int64_t, Time>`
- Added comprehensive tests covering various TIME values including edge cases
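
The modulo step mentioned above amounts to the following sketch. `millisecondFromTime` is a hypothetical free function that assumes a non-negative TIME value; it is not the registered Velox function.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kMillisPerSecond = 1'000;

// TIME is modeled as milliseconds since midnight; the remainder after
// removing whole seconds is the millisecond within the current second.
// Assumes time >= 0.
constexpr int64_t millisecondFromTime(int64_t time) {
  return time % kMillisPerSecond;
}
```

For 13:45:27.123 (49527123 ms since midnight), the result is 123.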

**Planned:**
- support for TIME WITH TIMEZONE

Reviewed By: kgpai

Differential Revision: D83291159

fbshipit-source-id: ce4ffda39df3c781a4b1bc47cb8360236b827d1e
…kincubator#15111)

Summary:
Pull Request resolved: facebookincubator#15111

Due to extra integration uncertainty in the metalake path and some previous QB signals, we decided to control row size tracking for metalake separately through the session property.

Instead of adding another session property, we decided to extend the current one. However, because changing the session property type breaks backward compatibility, we still ended up introducing a new query config. We will delete the deprecated bool session property in the 3rd diff of the stack.

Reviewed By: Yuhta

Differential Revision: D84228406

fbshipit-source-id: d4720ced4dc47d140d1d835a708f8f54d1b4a224
Summary:
Pull Request resolved: facebookincubator#15081

- added `diffTime()` function that calculates differences between TIME values
- simple millisecond arithmetic with unit conversion (TIME = ms since midnight)
- supports millisecond, second, minute, hour (rejects date-related units)
- `getTimeUnit()` for TIME-specific unit validation
- `initialize()` and `call()` method overloads for `<Time, Time>` parameters
- only allows time-related units, rejects day/month/year
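
A rough sketch of the arithmetic described above, under stated assumptions: `timeUnitMillis` and `diffTimeSketch` are hypothetical names, units are passed as lowercase strings, and partial units are truncated toward zero by integer division. The actual function's rounding and unit parsing may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Maps a time-related unit to its length in milliseconds and rejects
// date-related units such as day, month, and year.
int64_t timeUnitMillis(const std::string& unit) {
  if (unit == "millisecond") {
    return 1;
  }
  if (unit == "second") {
    return 1'000;
  }
  if (unit == "minute") {
    return 60'000;
  }
  if (unit == "hour") {
    return 3'600'000;
  }
  throw std::invalid_argument("Unsupported unit for TIME: " + unit);
}

// Both arguments are TIME values in milliseconds since midnight.
int64_t diffTimeSketch(const std::string& unit, int64_t from, int64_t to) {
  return (to - from) / timeUnitMillis(unit);
}
```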

Reviewed By: kgpai

Differential Revision: D84103016

fbshipit-source-id: 7a090e6dbe1d41882f9a3665c53831d283fa4d74
…tor#14956)

Summary:
facebookincubator#6395 fixes a deadlock caused by allocating memory during driver creation, so we should not initialize operators in DriverAdapter.

Remove the `driver_.initializeOperators()` call.
Expose the filter node from the FilterProject operator: when both project and filter exist, op->planNodeId only gives the project node id, which is not enough to construct the CudfFilterProject operator.
Move the CudfFilterProject initialization to the initialize() function.

Furthermore, if the Cudf ExpressionEvaluator can get information from ITypedExpr, we can even remove compileExpression.

The cudf tests are broken: the test below fails with or without this PR. Other tests pass.
```
[ RUN      ] OrderByTest.singleKey
2: /opt/velox/velox/exec/tests/utils/QueryAssertions.cpp:1285: Failure
2: Failed
2: Expected keys: 999, actual: null
2: Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
2: DuckDB query: SELECT * FROM tmp WHERE c0 % 2 >= 0 ORDER BY c0 DESC NULLS FIRST
```
Resolves: facebookincubator#14943

Pull Request resolved: facebookincubator#14956

Reviewed By: mbasmanova

Differential Revision: D84876948

Pulled By: pedroerp

fbshipit-source-id: 1a955737d81ff50768bc281be98d5805899aad14
…ebookincubator#15195)

Summary:
Pull Request resolved: facebookincubator#15195

There are a few identical methods to quickly compose a serde option across the codebase. Centralize them for consistency and maintainability.

Reviewed By: xiaoxmeng

Differential Revision: D84853981

fbshipit-source-id: 8da7086d4ae84509c846a708df263c5bf2837955
…4787)

Summary:
Pull Request resolved: facebookincubator#14787

Add P4HyperLogLog cast from/to HyperLogLog.
Java does not support a cast from/to varbinary, so it has not been added here.

https://www.internalfb.com/code/fbsource/fbcode/github/presto-trunk/presto-main-base/src/main/java/com/facebook/presto/type/HyperLogLogOperators.java?lines=24%2C26

Reviewed By: kagamiori

Differential Revision: D81630497

fbshipit-source-id: 768579ccdf6d874184b2307221d73a92a5855748