S2Cell::GetDistance: Optimize same-face case #479

jmr · 2025-11-24T10:40:12Z

Use the uv coordinates to prune the vertex-edge distance computations needed.

When cell A is above B, we don't need to compare the top vertices/edge of A with the bottom vertices/edge of B.

Note, however, that when A is both above and left of B, we cannot just compute the lower-right/upper-left vertex distance due to the projection.

It is possible that more comparisons can be omitted; I'm not sure.

Error is within 1e-15 radians of previous results.

GetDistance shows a 2x speedup for same-face cells.

jmr · 2025-11-24T10:41:52Z

@ericveach Could you take a look at this and see if it is close to what you had in mind with your TODO here?

s2geometry/src/s2/s2cell.cc

Lines 512 to 514 in ed62eee

    
    // TODO(ericv): This could be optimized to be at least 5x faster by pruning
    // the set of possible closest vertex/edge pairs using the faces and (u,v)
    // ranges of both cells.

ericveach · 2025-11-27T16:14:21Z

Hi Jesse,

Could you try this instead? (Again, no guarantees this will even compile, much less work.) I'm pretty sure (i.e. I think I've proved) that it's sufficient to only check the axis (u or v) where the two cells have the greatest u or v separation. In any case, testing is the easiest way to check whether I'm wrong.

Eric

if (face_ == target.face_) { // Find the index "ai" of the edge of A that is furthest away from the // opposite edge of B in (u,v)-space. int ai = -1; double max_dist = 0; auto checkEdge = [&ai, &max_dist](double dist, int a_edge) { if (dist > max_dist) { ai = a_edge; max_dist = dist; } }; checkEdge(uv_[0][0] - target.uv_[0][1], kLeftEdge); checkEdge(target.uv_[0][0] - uv_[0][1], kRightEdge); checkEdge(uv_[1][0] - target.uv_[1][1], kTopEdge); checkEdge(target.uv_[1][0] - uv_[1][1], kBottomEdge); if (ai < 0) { // A and B intersect (including edge and vertex intersections). return S1ChordAngle::Zero(); } // Otherwise the minimum distance always occurs between an endpoint of the // edge "ai" and the opposite edge (ai ^ 2) of B, or symmetrically, an // endpoint of the opposite edge of B and the edge "ai". int bi = ai ^ 2; S2::UpdateMinDistance(va[ai], vb[bi], vb[bi + 1], &min_dist); S2::UpdateMinDistance(va[ai + 1], vb[bi], vb[bi + 1], &min_dist); S2::UpdateMinDistance(vb[bi], va[ai], va[ai + 1], &min_dist); S2::UpdateMinDistance(vb[bi + 1], va[ai], va[ai + 1], &min_dist); return min_dist; }

ericveach · 2025-11-27T16:26:13Z

Somehow the formatting seems to get screwed up when I paste from a terminal window. Here's another attempt.

  if (face_ == target.face_) {
    // Find the index "ai" of the edge of A that is furthest away from the       
    // opposite edge of B in (u,v)-space.                                        
    int ai = -1;
    double max_dist = 0;
    auto checkEdge = [&ai, &max_dist](double dist, int a_edge) {
      if (dist > max_dist) {
        max_dist = dist;
        ai = a_edge;
      }
    };
    checkEdge(uv_[0][0] - target.uv_[0][1], kLeftEdge);
    checkEdge(target.uv_[0][0] - uv_[0][1], kRightEdge);
    checkEdge(uv_[1][0] - target.uv_[1][1], kTopEdge);
    checkEdge(target.uv_[1][0] - uv_[1][1], kBottomEdge);
    if (ai < 0) {
      // A and B intersect (including edge and vertex intersections).            
      return S1ChordAngle::Zero();
    }
    // Otherwise the minimum distance always occurs between an endpoint of the   
    // edge "ai" and the opposite edge (ai ^ 2) of B, or symmetrically, an       
    // endpoint of the opposite edge of B and the edge "ai".                     
    int bi = ai ^ 2;
    S2::UpdateMinDistance(va[ai], vb[bi], vb[bi + 1], &min_dist);
    S2::UpdateMinDistance(va[ai + 1], vb[bi], vb[bi + 1], &min_dist);
    S2::UpdateMinDistance(vb[bi], va[ai], va[ai + 1], &min_dist);
    S2::UpdateMinDistance(vb[bi + 1], va[ai], va[ai + 1], &min_dist);
    return min_dist;
  }

jmr · 2025-11-28T07:26:21Z

Thanks, Eric! #479 (comment) works for me after I swapped kTopEdge and kBottomEdge and used (_ + 1) & 3 instead of _ + 1. I tried something similar after I noticed the test failures were far in one direction and not the other, but I must have made a mistake when I tried it.
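The two fixes described above can be sketched in plain (u,v) space. This is only an illustration, not the code that was merged: `UVRect`, `FindFurthestEdgeSketch` and `VerticesOfEdge` are hypothetical names, and the edge constants assume S2Cell's counter-clockwise numbering (bottom=0, right=1, top=2, left=3, with edge k running from vertex k to vertex k+1 mod 4).

```cpp
#include <cassert>

// Assumed edge numbering (CCW): edge k runs from vertex k to vertex (k + 1) & 3.
constexpr int kBottomEdge = 0;
constexpr int kRightEdge = 1;
constexpr int kTopEdge = 2;
constexpr int kLeftEdge = 3;

// Hypothetical stand-in for a cell's (u,v) bounds:
// uv[axis][0] is the low bound, uv[axis][1] is the high bound.
struct UVRect {
  double uv[2][2];
};

// Returns the edge of `a` that faces `b` along the axis with the greatest
// (u,v) separation, or -1 if the rectangles intersect or touch.
int FindFurthestEdgeSketch(const UVRect& a, const UVRect& b) {
  assert(a.uv[0][0] <= a.uv[0][1] && a.uv[1][0] <= a.uv[1][1]);
  assert(b.uv[0][0] <= b.uv[0][1] && b.uv[1][0] <= b.uv[1][1]);
  int ai = -1;
  double max_dist = 0;
  auto check_edge = [&ai, &max_dist](double dist, int a_edge) {
    if (dist > max_dist) {
      max_dist = dist;
      ai = a_edge;
    }
  };
  check_edge(a.uv[0][0] - b.uv[0][1], kLeftEdge);    // A is right of B.
  check_edge(b.uv[0][0] - a.uv[0][1], kRightEdge);   // A is left of B.
  check_edge(a.uv[1][0] - b.uv[1][1], kBottomEdge);  // A is above B (swapped).
  check_edge(b.uv[1][0] - a.uv[1][1], kTopEdge);     // A is below B (swapped).
  return ai;
}

// The two vertex indices of edge `e`; the mask wraps edge 3 back to vertex 0.
struct EdgeVertices {
  int first;
  int second;
};
EdgeVertices VerticesOfEdge(int e) { return {e, (e + 1) & 3}; }
```

With this numbering, `ai ^ 2` still selects the opposite edge of B, and the `& 3` mask matters only for the left edge, whose second endpoint is vertex 0 rather than an out-of-range index 4.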

jmr · 2025-11-28T10:03:20Z

I have incorporated your suggestion, with a separate FindFurthestEdge function since this will soon be used in S2Cell::IsDistanceLess, too.

ericveach · 2025-11-28T16:50:55Z

Although I can't build the actual library right now I did try pasting some version of the code I suggested into compiler explorer, and when built under Clang the FindFurthestEdge portion is branchless. Given this and the fact that only 4 edges are tested, how much faster is it now?

Unfortunately GCC doesn't seem willing to generate branchless code, but there's not much we can do about that.

jmr · 2025-11-29T10:21:40Z

> Although I can't build the actual library right now I did try pasting some version of the code I suggested into compiler explorer, and when built under Clang the FindFurthestEdge portion is branchless. Given this and the fact that only 4 edges are tested, how much faster is it now?

UpdateMinDistance is a 3.8x speedup, and GetVertex is 1.5x. Based on operation count, 8x and 2x would be ideal.

Before, it was 80% UpdateMinDistance and 15% GetVertex; after, it's 60% and 30%.

This gives an overall 3x speedup. We can take the win and continue to optimize. The largest chunk of the remaining time is in vector Normalize and Norm2.
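As a rough consistency check on these figures, the per-component speedups can be combined Amdahl-style. This is a sketch under one assumption not stated in the thread: the roughly 5% of time outside UpdateMinDistance and GetVertex is unchanged. `Component` and `OverallSpeedup` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// One profiled component: its share of the original run time and the
// factor by which it got faster.
struct Component {
  double share_before;
  double speedup;
};

// Overall speedup when each listed component speeds up by its own factor
// and the unlisted remainder of the run time stays the same (assumption).
double OverallSpeedup(const std::vector<Component>& parts) {
  double accounted = 0.0;
  double time_after = 0.0;
  for (const Component& c : parts) {
    assert(c.share_before >= 0.0 && c.speedup > 0.0);
    accounted += c.share_before;
    time_after += c.share_before / c.speedup;
  }
  assert(accounted <= 1.0 + 1e-12);
  time_after += 1.0 - accounted;  // Unchanged remainder.
  return 1.0 / time_after;
}
```

For `{{0.80, 3.8}, {0.15, 1.5}}` the new run time is 0.80/3.8 + 0.15/1.5 + 0.05, about 0.36 of the original, which is a speedup of about 2.8x. The new shares come out near 58% and 28%, in line with the rounded 60% and 30% above.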

> Unfortunately GCC doesn't seem willing to generate branchless code, but there's not much we can do about that.

I can get gcc to generate branchless code by giving it a branch probability somewhere between 0.21 and 0.33.

https://godbolt.org/z/rhqGcrqE9

It would be interesting to see the actual branch probability and what FDO did.
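One way to hand GCC a branch probability is the `__builtin_expect_with_probability` builtin. Whether the linked Compiler Explorer example uses this builtin is an assumption, and so is the value 0.25, picked only because it lies in the 0.21 to 0.33 range mentioned above. `TrueWithProbabilityQuarter` and `ArgMaxPositive` are hypothetical names for this sketch.

```cpp
#include <cassert>

// Detect the builtin where the compiler lets us ask; otherwise fall back
// to a plain condition with no hint.
#if defined(__GNUC__) && defined(__has_builtin)
#if __has_builtin(__builtin_expect_with_probability)
#define SKETCH_HAS_PROB_HINT 1
#endif
#endif

// Tells the compiler that `cond` is expected to be true about 25% of the
// time. The value of `cond` itself is returned unchanged.
inline bool TrueWithProbabilityQuarter(bool cond) {
#ifdef SKETCH_HAS_PROB_HINT
  return __builtin_expect_with_probability(cond, 1, 0.25) != 0;
#else
  return cond;
#endif
}

// Index of the largest strictly positive entry, or -1 if there is none.
// This mirrors the shape of the checkEdge loop discussed in this thread.
int ArgMaxPositive(const double* dist, int n) {
  assert(dist != nullptr && n >= 0);
  int best = -1;
  double max_dist = 0;
  for (int i = 0; i < n; ++i) {
    if (TrueWithProbabilityQuarter(dist[i] > max_dist)) {
      max_dist = dist[i];
      best = i;
    }
  }
  return best;
}
```

The hint changes only code generation, never the result. Whether a given GCC version then emits conditional moves instead of branches would still need to be checked in the generated assembly.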

jmr · 2025-11-29T13:14:18Z

It's also worth mentioning that only 1/5 as many instructions were executed and IPC decreased from 3.2 to 1.8 (less unnecessary extra work to be done in parallel).
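These two counters are consistent with the overall speedup, since cycles equal instructions divided by IPC. A minimal sketch of that arithmetic, where `CycleSpeedup` is a hypothetical helper:

```cpp
#include <cassert>

// Speedup in cycles when the instruction count shrinks by
// `instruction_reduction` (5 means one fifth as many instructions)
// and IPC moves from `ipc_before` to `ipc_after`.
double CycleSpeedup(double instruction_reduction, double ipc_before,
                    double ipc_after) {
  assert(instruction_reduction > 0 && ipc_before > 0 && ipc_after > 0);
  return instruction_reduction * ipc_after / ipc_before;
}
```

`CycleSpeedup(5.0, 3.2, 1.8)` evaluates to 5 × 1.8 / 3.2 ≈ 2.8, close to the roughly 3x measured.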

Working on releasing benchmarks now.

ericveach · 2025-11-29T15:57:44Z

Thanks for the comprehensive analysis! It looks like the performance analysis tools available have improved a lot in the past few years.

Just to clarify, are the results you mentioned for random cells, or for cells on the same face, or for cells that are small relative to their separation distance? I suspect that the last case may be the most important in practice, e.g. where the separation distance may be up to a few hundred or thousands of km but the cell sizes are, say, only 1% or 10% of that distance. This is the situation you would often have when measuring distances between coverings of real-world geometry, for example.

ericveach · 2025-11-29T15:59:18Z

Maybe your existing change is big enough, but if you wanted to try adding the vertex-only case, here is a stab at it:

  if (face_ == target.face_) {
    // In certain cases the distance between two cells is attained between a     
    // pair of vertices.  This makes the distance very cheap to compute and so   
    // it's worth detecting the easy cases where this happens.  Recall that      
    // all cells except at level 0 are slightly diamond-shaped, i.e. one         
    // diagonal is slightly longer than the other and the corresponding cell     
    // corners are either right-angled or acute.  Define a diagonal to have      
    // positive slope if u and v both increase along it, and negative slope
    // otherwise.  Furthermore define two cells to be separated along a          
    // positive diagonal if one has strictly larger u- and v-values than the     
    // other, and along a negative diagonal if one cell has strictly smaller     
    // u-values and strictly larger v-values or vice versa.  Then if two cells   
    // A and B are separated along a positive diagonal and the long diagonals    
    // of A and B both have positive slope, then the minimum distance occurs     
    // between two vertices; namely, the closer endpoints of their long          
    // diagonals.  The same is true if the two cells are separated along a       
    // negative diagonal and both long diagonals have negative slope.  One of    
    // these two situations is expected to occur almost 50% of the time when     
    // the cells are on the same face and are small relative to their            
    // separation distance.                                                      
    //                                                                           
    // For the purpose of determining whether cells are separated along a        
    // diagonal, as described above, we assign constants to the four edges of    
    // A as follows: L=1, T=3, R=3, B=9.  "sep_dirs" is the sum of these         
    // values for the edges of A that separate A from B.  This yields the        
    // following possible sums: TL=4, TR=6, BL=10, BR=12, L=1, T=3, R=3, B=9,    
    // none=0.  This lets us test the following conditions cheaply, given that   
    // we test them in the following order:                                      
    //                                                                           
    //  sep_dirs == 0 : the two cells intersect                                  
    //  !(sep_dirs & 1) : cells are separated along a diagonal (TL, TR, BL, BR)  
    //  (sep_dirs & 2) : cells are separated along a positive diagonal (TR, BL)
    //  (sep_dirs < 8) : cell A is below cell B (TL, TR)

    int sep_dirs = 0;
    R2Rect a_uv = a.GetBoundUV();
    R2Rect b_uv = b.GetBoundUV();
    if (a_uv[0][0] > b_uv[0][1]) sep_dirs += 1;  // left side of A               
    if (a_uv[0][1] < b_uv[0][0]) sep_dirs += 3;  // right                        
    if (a_uv[1][0] > b_uv[1][1]) sep_dirs += 9;  // bottom                       
    if (a_uv[1][1] < b_uv[1][0]) sep_dirs += 3;  // top                          
    if (sep_dirs == 0) {
      // A and B intersect (this includes edge and vertex intersections).        
      return S1ChordAngle::Zero();
    }
    // Otherwise if the two cells are separated along a diagonal, check if the   
    // cells also have their long diagonals in that direction.                   
    bool separated = !(sep_dirs & 1);  // TL, TR, BL, BR                         
    if (separated) {
      bool sep_positive = (sep_dirs & 2) != 0;  // TR or BL                      
      bool a_positive = (a_uv[0][0] < 0) != (a_uv[1][0] < 0);
      bool b_positive = (b_uv[0][0] < 0) != (b_uv[1][0] < 0);
      // The use of & rather than && below encourages branchless compilation.    
      if (a_positive == sep_positive & b_positive == sep_positive) {
        // Compute the vertex of A that is closest to B, without branches.       
        // Vertices are numbered as follows: BL=0, BR=1, TR=2, TL=3.             
        // a_positive implies that the closest vertex to B is TR or BL.          
        // a_top_edge implies that the closest vertex to B is TR or TL.          
        bool a_top_edge = sep_dirs < 8;  // TR or TL                             
        int i = (a_positive ? 0 : 1) + (a_top_edge ? 2 : 0);
        return S1ChordAngle(GetVertex(i), target_.GetVertex(i ^ 2));
      }
    }
    // Otherwise carry on with the existing code, except that the (ai < 0) case is no longer needed.
  }

Again, no guarantees that this will work or even compile. But at least in the case of cells that are small relative to their separation distance, and where the separation distance is small relative to the Earth's radius, it should yield a substantial speedup.
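The `sep_dirs` encoding can be exercised on its own in (u,v) space. This is a sketch of the classification only, not of the distance computation. `UVRect` and the predicate names are hypothetical, and the weights (L=1, R=3, T=3, B=9) are the ones from the comment above.

```cpp
#include <cassert>

// Hypothetical stand-in for a cell's (u,v) bounds:
// uv[axis][0] is the low bound, uv[axis][1] is the high bound.
struct UVRect {
  double uv[2][2];
};

// Sum of the weights of the edges of A that separate A from B.
int SeparatingDirs(const UVRect& a, const UVRect& b) {
  assert(a.uv[0][0] <= a.uv[0][1] && a.uv[1][0] <= a.uv[1][1]);
  assert(b.uv[0][0] <= b.uv[0][1] && b.uv[1][0] <= b.uv[1][1]);
  int sep_dirs = 0;
  if (a.uv[0][0] > b.uv[0][1]) sep_dirs += 1;  // Left edge of A faces B.
  if (a.uv[0][1] < b.uv[0][0]) sep_dirs += 3;  // Right edge of A faces B.
  if (a.uv[1][0] > b.uv[1][1]) sep_dirs += 9;  // Bottom edge of A faces B.
  if (a.uv[1][1] < b.uv[1][0]) sep_dirs += 3;  // Top edge of A faces B.
  return sep_dirs;
}

// Possible sums: none=0, L=1, T=3, R=3, B=9, TL=4, TR=6, BL=10, BR=12.
bool Intersects(int s) { return s == 0; }
// Every single-edge sum is odd and every two-edge sum is even.
bool SeparatedAlongDiagonal(int s) { return s != 0 && !(s & 1); }
// Bit 1 is set for TR (6) and BL (10), clear for TL (4) and BR (12).
bool SeparatedAlongPositiveDiagonal(int s) {
  return SeparatedAlongDiagonal(s) && (s & 2) != 0;
}
// Among the diagonal sums, only TL (4) and TR (6) are below 8.
bool TopEdgeFacesTarget(int s) { return SeparatedAlongDiagonal(s) && s < 8; }
```

For example, with A = [0,1]×[0,1] and B = [2,3]×[2,3] the sum is 6 (TR), which is a positive diagonal with A's top edge facing B. Moving B to [2,3]×[0,1] gives 3, an odd sum, so the vertex-only shortcut does not apply.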

jmr · 2025-12-01T08:29:05Z

> It looks like the performance analysis tools available have improved a lot in the past few years.

Definitely. I used:

* --benchmark_perf_counters=CYCLES,INSTRUCTIONS: https://github.com/google/benchmark/blob/main/docs/perf_counters.md
* Staring at flame graphs in pprof. I'm not sure of the state of the internal vs external versions, but the external one does support flame graphs: google/pprof@8b542ba
* benchstat to compare benchmark output: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

> Just to clarify, are the results you mentioned for random cells, or for cells on the same face, or for cells that are small relative to their separation distance?

Random same-face cells.

// Copyright 2025 Google LLC.
// SPDX-License-Identifier: Apache-2.0
static void BM_GetDistanceToCellSameFace(benchmark::State& state) {
  const string seed_str = StrCat("GET_DISTANCE_TO_CELL_SAME_FACE",
                                 absl::GetFlag(FLAGS_s2_random_seed));
  std::seed_seq seed(seed_str.begin(), seed_str.end());
  std::mt19937_64 bitgen(seed);
  std::vector<S2Cell> cells;
  cells.reserve(kBatchSize);
  for (int i = 0; i < kBatchSize; ++i) {
    // Make a cell id and move it to face 0.
    S2CellId cellid = s2random::CellId(bitgen);
    cells.emplace_back(
        S2CellId::FromFacePosLevel(0, cellid.pos(), cellid.level()));
  }

  int i = 0;
  while (state.KeepRunningBatch(kBatchSize)) {
    const S2Cell& cell1 = (cells)[i];
    for (const S2Cell& cell2 : cells) {
      S1ChordAngle distance = cell1.GetDistance(cell2);
      benchmark::DoNotOptimize(distance);
    }
    if (++i == kBatchSize) i = 0;
  }
}
BENCHMARK(BM_GetDistanceToCellSameFace);

A more realistic version of this could be done. There are also larger-scale benchmarks.

Re #479 (comment). Thanks for that. I will try it, but definitely separately.

jmr · 2025-12-01T10:19:17Z

When I naively try #479 (comment), it's about 5% faster on BM_GetDistanceToCellSameFace. Some of the other benchmarks also show similar speedups; others don't. I haven't run all the benchmarks yet, and won't have time to look at this in detail for a while.

ericveach · 2025-12-01T15:56:59Z

Thanks for trying that, and also for the profiling info. Your benchmark looks pretty good; the only downside is that I think it probably significantly overweights large cells, since s2random::CellId() chooses the cell level randomly between 0 and 30. So for example, ~10% of the cells will be 2500km across or larger, and ~19% of random pairs will involve a cell at least this big. This means the test is significantly weighted towards pairs that overlap or that are not well separated relative to their size. If you wanted to test the well-separated case specifically, cell levels [10..30] might be an appropriate choice.
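The pair figure follows from the per-cell figure if the two cells are drawn independently. A small sketch of that arithmetic, where `AtLeastOne` is a hypothetical helper:

```cpp
#include <cassert>
#include <cmath>

// Probability that at least one of `k` independent draws has a property
// that each draw has with probability `p`.
double AtLeastOne(double p, int k) {
  assert(p >= 0.0 && p <= 1.0 && k >= 0);
  return 1.0 - std::pow(1.0 - p, k);
}
```

`AtLeastOne(0.10, 2)` gives 1 − 0.9² = 0.19, matching the ~19% above. If "2500km across or larger" is taken to mean levels 0 to 2 out of 31 equally likely levels (an assumption about how the 10% was rounded), the per-cell probability is 3/31 ≈ 9.7% and the pair probability is about 18.4%.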

* Add S2Cell::IsDistanceLess implementation along the lines of GetDistance
* Rename "HighErrorExample" test to "HighDifferenceExample"
* Use UpdateMinInteriorDistance when endpoints have already been checked
* Reword FindFurthestEdge comment
* Rename GetDistanceToCellBruteForce args

ericveach

Looks good!

ericveach · 2025-12-08T03:08:15Z

No, I was just trying to clarify what this test is actually testing. It's fine as is.

On Sun, Dec 7, 2025 at 6:39 AM Jesse Rosenstock wrote:

In src/s2/s2cell_test.cc:

    +    SCOPED_TRACE(StrCat("Iteration ", iter));
    +    S2Cell cell1(s2random::CellId(bitgen));
    +    S2Cell cell2(s2random::CellId(bitgen));
    +    S1ChordAngle expected = GetDistanceToCellBruteForce(cell1, cell2);
    +    S1ChordAngle actual = cell1.GetDistance(cell2);
    +    EXPECT_NEAR(expected.radians(), actual.radians(), 1e-15)
    +        << "cell1: " << cell1.id() << " cell2: " << cell2.id()
    +        << " cell1.uv: " << cell1.GetBoundUV()
    +        << " cell2.uv: " << cell2.GetBoundUV();
    +  }
    +}
    +
    +TEST(S2Cell, GetDistanceToCellHighErrorExample) {
    +  // This is a test case extracted from `GetDistanceToCell`; it achieved
    +  // the maximum error over 100M iterations, so is useful to understand
    +  // the errors and as a shortcut for testing.

Ok. I thought you wanted me to add a test using GetUpdateMinDistanceMaxError.

jmr · 2025-12-08T07:31:42Z

Eric, thank you for your time and careful attention.

jmr mentioned this pull request Nov 24, 2025

Optimize S2Cell::GetDistance #467

Open

Only check direction with max uv distance

2ff8bdb

Suggested by @ericveach: google#479 (comment)

ericveach approved these changes Dec 2, 2025

jmr commented Dec 2, 2025

ericveach approved these changes Dec 4, 2025

IsDistanceLess: Replace if .. return cascade with ||

7cf212a

google#479 (comment)

jmr merged commit dc9df71 into google:master Dec 8, 2025
11 checks passed

jmr deleted the cell-cell-dist branch December 8, 2025 07:31
