CSS_Project/models/numba_optimized.py at main · codegithubka/CSS_Project

#!/usr/bin/env python3
"""
Numba-Optimized Kernels
=======================

This module provides Numba-accelerated kernels for the predator-prey
cellular automaton, including update kernels and spatial analysis functions.

Classes
-------
PPKernel
    Wrapper for predator-prey update kernels with pre-allocated buffers.

Cluster Analysis
----------------
```python
measure_cluster_sizes_fast  # Fast cluster size measurement (sizes only).
detect_clusters_fast        # Full cluster detection with labels.
get_cluster_stats_fast      # Comprehensive cluster statistics.
```

Pair Correlation Functions
--------------------------
```python
compute_pcf_periodic_fast  # PCF for two position sets with periodic boundaries.
compute_all_pcfs_fast      # Compute prey-prey, pred-pred, and prey-pred PCFs.
```

Utilities
---------
```python
set_numba_seed        # Seed Numba's internal RNG.
warmup_numba_kernels  # Pre-compile kernels to avoid first-run latency.
```

Example
-------
```python
from models.numba_optimized import (
    PPKernel,
    get_cluster_stats_fast,
    compute_all_pcfs_fast,
)

# Cluster analysis
stats = get_cluster_stats_fast(grid, species=1)
print(f"Largest cluster: {stats['largest']}")

# PCF computation
pcfs = compute_all_pcfs_fast(grid, max_distance=20.0)
prey_prey_dist, prey_prey_gr, _ = pcfs['prey_prey']
```
"""

import numpy as np
from typing import Tuple, Dict, Optional

try:
    from numba import njit, prange

    NUMBA_AVAILABLE = True
except ImportError:
    NUMBA_AVAILABLE = False

    def njit(*args, **kwargs):
        def decorator(func):
            return func

        return decorator

    def prange(*args):
        return range(*args)

# ============================================================================
# RNG SEEDING
# ============================================================================


@njit(cache=True)
def set_numba_seed(seed: int) -> None:
    """
    Seed Numba's internal random number generator from within a JIT context.

    This function ensures that Numba's independent random number generator
    is synchronized with the provided seed, enabling reproducibility for
    jit-compiled functions that use NumPy's random operations.

    Parameters
    ----------
    seed : int
        The integer value used to initialize the random number generator.

    Returns
    -------
    None

    Notes
    -----
    Because Numba maintains its own internal state for random number
    generation, calling `np.random.seed()` in standard Python code will not
    affect jit-compiled functions. This helper must be called to bridge
    that gap.
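
The behaviour described above can be sketched as follows. This is a minimal
illustration, not part of the module: `seed_jit` and `draw_jit` are
hypothetical names, and the fallback decorator only mirrors this module's
import guard so the sketch also runs without Numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # plain-Python fallback, as in this module's import guard
    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


@njit()
def seed_jit(seed):
    # Seeds the generator that jit-compiled code draws from.
    np.random.seed(seed)


@njit()
def draw_jit(n):
    # Draws n uniform samples from inside a jit-compiled function.
    out = np.empty(n)
    for i in range(n):
        out[i] = np.random.random()
    return out


seed_jit(123)
first = draw_jit(5)
seed_jit(123)
second = draw_jit(5)  # re-seeding from jit code should replay the sequence
```

Re-seeding through the jit-compiled helper before each run is what makes the
two draws comparable.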

"""

    np.random.seed(seed)


# ============================================================================
# PREDATOR-PREY KERNELS
# ============================================================================


@njit(cache=True)
def _pp_async_kernel_random(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with random neighbor selection.

    This Numba-accelerated kernel performs an asynchronous update of the
    simulation grid. It identifies all occupied cells, shuffles them to
    ensure unbiased processing, and applies stochastic rules for prey
    mortality, prey reproduction (with optional parameter evolution),
    predator mortality, and predation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing the individual prey death rates for evolution tracking.
    p_birth_val : float
        Base probability of prey reproduction into an adjacent empty cell.
    p_death_val : float
        Base probability of prey death. This kernel does not read it; the
        per-cell rates in `prey_death_arr` are applied instead.
    pred_birth_val : float
        Probability of a predator reproducing after consuming prey.
    pred_death_val : float
        Probability of a predator dying.
    dr_arr : np.ndarray
        Array of row offsets defining the neighborhood.
    dc_arr : np.ndarray
        Array of column offsets defining the neighborhood.
    evolve_sd : float
        Standard deviation of the mutation applied to the prey death rate during reproduction.
    evolve_min : float
        Lower bound for the evolved prey death rate.
    evolve_max : float
        Upper bound for the evolved prey death rate.
    evolution_stopped : bool
        If True, offspring inherit the parent's death rate without mutation.
    occupied_buffer : np.ndarray
        Pre-allocated 2D array used to store and shuffle coordinates of occupied cells.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid (modified in place).

    Notes
    -----
    The kernel uses periodic boundary conditions. The Fisher-Yates shuffle on
    `occupied_buffer` ensures that the asynchronous updates do not introduce
    directional bias.

"""

    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        # Random neighbor selection
        nbi = np.random.randint(0, n_shifts)
        nr = (r + dr_arr[nbi]) % rows
        nc = (c + dc_arr[nbi]) % cols

        if state == 1:  # PREY
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
            elif grid[nr, nc] == 0:
                if np.random.random() < p_birth_val:
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
            elif grid[nr, nc] == 1:
                if np.random.random() < pred_birth_val:
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid

@njit(cache=True)
def _pp_async_kernel_directed(
    grid: np.ndarray,
    prey_death_arr: np.ndarray,
    p_birth_val: float,
    p_death_val: float,
    pred_birth_val: float,
    pred_death_val: float,
    dr_arr: np.ndarray,
    dc_arr: np.ndarray,
    evolve_sd: float,
    evolve_min: float,
    evolve_max: float,
    evolution_stopped: bool,
    occupied_buffer: np.ndarray,
) -> np.ndarray:
    """
    Asynchronous predator-prey update kernel with directed behavior.

    This kernel implements "intelligent" species behavior: prey actively search
    for empty spaces to reproduce, and predators actively search for nearby
    prey to hunt. A two-pass approach is used to stochastically select a
    valid target from the neighborhood without heap allocation.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation grid (0: Empty, 1: Prey, 2: Predator).
    prey_death_arr : np.ndarray
        2D float array storing individual prey mortality rates for evolution.
    p_birth_val : float
        Probability of prey reproduction attempt.
    p_death_val : float
        Base probability of prey mortality. This kernel does not read it; the
        per-cell rates in `prey_death_arr` are applied instead.
    pred_birth_val : float
        Probability of a predator reproduction attempt (hunting success).
    pred_death_val : float
        Probability of predator mortality.
    dr_arr : np.ndarray
        Row offsets defining the spatial neighborhood (e.g., Moore or von Neumann).
    dc_arr : np.ndarray
        Column offsets defining the spatial neighborhood.
    evolve_sd : float
        Standard deviation for mutations in prey death rates.
    evolve_min : float
        Minimum allowable value for evolved prey death rates.
    evolve_max : float
        Maximum allowable value for evolved prey death rates.
    evolution_stopped : bool
        If True, prevents mutation during prey reproduction.
    occupied_buffer : np.ndarray
        Pre-allocated array for storing and shuffling active cell coordinates.

    Returns
    -------
    grid : np.ndarray
        The updated simulation grid (modified in place).

    Notes
    -----
    The directed behavior significantly changes the system dynamics compared to
    random neighbor selection, often leading to different critical thresholds
    and spatial patterning. Periodic boundary conditions are applied.
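
The two-pass selection can be sketched outside Numba as below. This is a
hedged illustration only: `pick_neighbor` is a hypothetical helper, and it
draws from a NumPy `Generator` rather than the legacy `np.random` calls the
kernel uses.

```python
import numpy as np


def pick_neighbor(grid, r, c, dr, dc, target, rng):
    # Return a uniformly chosen neighbor of (r, c) holding `target`, or None.
    rows, cols = grid.shape
    # Pass 1: count matching neighbors (periodic boundaries via modulo).
    n_match = 0
    for k in range(len(dr)):
        if grid[(r + dr[k]) % rows, (c + dc[k]) % cols] == target:
            n_match += 1
    if n_match == 0:
        return None
    # Pass 2: walk the neighborhood again and stop at the chosen match.
    wanted = int(rng.integers(0, n_match))
    seen = 0
    for k in range(len(dr)):
        nr, nc = (r + dr[k]) % rows, (c + dc[k]) % cols
        if grid[nr, nc] == target:
            if seen == wanted:
                return int(nr), int(nc)
            seen += 1
    return None


grid = np.array([[2, 1, 0],
                 [0, 1, 0],
                 [0, 0, 1]], dtype=np.int32)
dr = np.array([-1, 1, 0, 0])  # von Neumann offsets
dc = np.array([0, 0, -1, 1])
rng = np.random.default_rng(1)
choice = pick_neighbor(grid, 0, 0, dr, dc, 1, rng)   # only (0, 1) holds prey
nothing = pick_neighbor(grid, 0, 0, dr, dc, 5, rng)  # no neighbor matches
```

Counting first and indexing second avoids building a temporary list of
candidates, at the cost of scanning the neighborhood twice.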

"""

    rows, cols = grid.shape
    n_shifts = len(dr_arr)

    # Collect occupied cells
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                occupied_buffer[count, 0] = r
                occupied_buffer[count, 1] = c
                count += 1

    # Fisher-Yates shuffle
    for i in range(count - 1, 0, -1):
        j = np.random.randint(0, i + 1)
        occupied_buffer[i, 0], occupied_buffer[j, 0] = (
            occupied_buffer[j, 0],
            occupied_buffer[i, 0],
        )
        occupied_buffer[i, 1], occupied_buffer[j, 1] = (
            occupied_buffer[j, 1],
            occupied_buffer[i, 1],
        )

    # Process each occupied cell
    for i in range(count):
        r = occupied_buffer[i, 0]
        c = occupied_buffer[i, 1]

        state = grid[r, c]
        if state == 0:
            continue

        if state == 1:  # PREY - directed reproduction into empty cells
            # Check for death first
            if np.random.random() < prey_death_arr[r, c]:
                grid[r, c] = 0
                prey_death_arr[r, c] = np.nan
                continue

            # Attempt reproduction with directed selection
            if np.random.random() < p_birth_val:
                # Pass 1: Count empty neighbors
                empty_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 0:
                        empty_count += 1

                # Pass 2: Select random empty neighbor
                if empty_count > 0:
                    target_idx = np.random.randint(0, empty_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 0:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Reproduce into selected empty cell
                    grid[nr, nc] = 1
                    parent_val = prey_death_arr[r, c]
                    if not evolution_stopped:
                        child_val = parent_val + np.random.normal(0, evolve_sd)
                        if child_val < evolve_min:
                            child_val = evolve_min
                        if child_val > evolve_max:
                            child_val = evolve_max
                        prey_death_arr[nr, nc] = child_val
                    else:
                        prey_death_arr[nr, nc] = parent_val

        elif state == 2:  # PREDATOR - directed hunting
            # Check for death first
            if np.random.random() < pred_death_val:
                grid[r, c] = 0
                continue

            # Attempt hunting with directed selection
            if np.random.random() < pred_birth_val:
                # Pass 1: Count prey neighbors
                prey_count = 0
                for k in range(n_shifts):
                    check_r = (r + dr_arr[k]) % rows
                    check_c = (c + dc_arr[k]) % cols
                    if grid[check_r, check_c] == 1:
                        prey_count += 1

                # Pass 2: Select random prey neighbor
                if prey_count > 0:
                    target_idx = np.random.randint(0, prey_count)
                    found = 0
                    nr, nc = r, c  # Initialize (will be overwritten)
                    for k in range(n_shifts):
                        check_r = (r + dr_arr[k]) % rows
                        check_c = (c + dc_arr[k]) % cols
                        if grid[check_r, check_c] == 1:
                            if found == target_idx:
                                nr, nc = check_r, check_c
                                break
                            found += 1

                    # Hunt: prey cell becomes predator
                    grid[nr, nc] = 2
                    prey_death_arr[nr, nc] = np.nan

    return grid

class PPKernel:
    """
    Wrapper for predator-prey kernel with pre-allocated buffers.

    This class manages the spatial configuration and memory buffers required
    for the Numba-accelerated update kernels. By pre-allocating the
    `occupied_buffer`, it avoids expensive memory allocations during the
    simulation loop.

    Parameters
    ----------
    rows : int
        Number of rows in the simulation grid.
    cols : int
        Number of columns in the simulation grid.
    neighborhood : {'moore', 'von_neumann'}, optional
        The neighborhood type determining adjacent cells. 'moore' includes
        diagonals (8 neighbors), 'von_neumann' does not (4 neighbors).
        Any value other than 'moore' selects the von Neumann offsets.
        Default is 'moore'.
    directed_hunting : bool, optional
        If True, uses the directed behavior kernel where species search for
        targets. If False, uses random neighbor selection. Default is False.

    Attributes
    ----------
    rows : int
        Grid row count.
    cols : int
        Grid column count.
    directed_hunting : bool
        Toggle for intelligent behavior logic.
    """

    def __init__(
        self,
        rows: int,
        cols: int,
        neighborhood: str = "moore",
        directed_hunting: bool = False,
    ):
        self.rows = rows
        self.cols = cols
        self.directed_hunting = directed_hunting
        # One (row, col) slot per grid cell, reused on every update step.
        self._occupied_buffer = np.empty((rows * cols, 2), dtype=np.int32)

        if neighborhood == "moore":
            self._dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
            self._dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        else:  # von Neumann
            self._dr = np.array([-1, 1, 0, 0], dtype=np.int32)
            self._dc = np.array([0, 0, -1, 1], dtype=np.int32)

    def update(
        self,
        grid: np.ndarray,
        prey_death_arr: np.ndarray,
        prey_birth: float,
        prey_death: float,
        pred_birth: float,
        pred_death: float,
        evolve_sd: float = 0.1,
        evolve_min: float = 0.001,
        evolve_max: float = 0.1,
        evolution_stopped: bool = True,
    ) -> np.ndarray:
        """
        Execute a single asynchronous update step using the configured kernel.

        Parameters
        ----------
        grid : np.ndarray
            The current 2D simulation grid.
        prey_death_arr : np.ndarray
            2D array of individual prey mortality rates.
        prey_birth : float
            Prey reproduction probability.
        prey_death : float
            Base prey mortality probability.
        pred_birth : float
            Predator reproduction (hunting success) probability.
        pred_death : float
            Predator mortality probability.
        evolve_sd : float, optional
            Mutation standard deviation (default 0.1).
        evolve_min : float, optional
            Minimum evolved death rate (default 0.001).
        evolve_max : float, optional
            Maximum evolved death rate (default 0.1).
        evolution_stopped : bool, optional
            Whether to disable mutation during this step (default True).

        Returns
        -------
        np.ndarray
            The updated grid after one full asynchronous pass.
        """
        if self.directed_hunting:
            return _pp_async_kernel_directed(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )
        else:
            return _pp_async_kernel_random(
                grid,
                prey_death_arr,
                prey_birth,
                prey_death,
                pred_birth,
                pred_death,
                self._dr,
                self._dc,
                evolve_sd,
                evolve_min,
                evolve_max,
                evolution_stopped,
                self._occupied_buffer,
            )

# ============================================================================
# CLUSTER DETECTION (ENHANCED)
# ============================================================================


@njit(cache=True)
def _flood_fill(
    grid: np.ndarray,
    visited: np.ndarray,
    start_r: int,
    start_c: int,
    target: int,
    rows: int,
    cols: int,
    moore: bool,
) -> int:
    """
    Perform a stack-based flood fill to measure the size of a connected cluster.

    This Numba-accelerated function identifies all contiguous cells of a
    specific target value starting from a given coordinate. It supports
    both Moore and von Neumann neighborhoods and implements periodic
    boundary conditions (toroidal topology).

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    visited : np.ndarray
        2D boolean array tracked across calls to avoid re-processing cells.
    start_r : int
        Starting row index for the flood fill.
    start_c : int
        Starting column index for the flood fill.
    target : int
        The cell value (e.g., 1 for Prey, 2 for Predator) to include in the cluster.
    rows : int
        Total number of rows in the grid.
    cols : int
        Total number of columns in the grid.
    moore : bool
        If True, use a Moore neighborhood (8 neighbors). If False, use a
        von Neumann neighborhood (4 neighbors).

    Returns
    -------
    size : int
        The total number of connected cells belonging to the cluster.

    Notes
    -----
    The function uses a manual stack implementation to avoid recursion limit
    issues and is optimized for use within JIT-compiled loops.
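
A plain-Python sketch of the same idea is given below. It is an illustration
under stated assumptions: `flood_fill_size` is a hypothetical helper, it fixes
4-way connectivity, and it uses a Python list as the stack instead of the
pre-sized NumPy arrays used in the jit-compiled version.

```python
import numpy as np


def flood_fill_size(grid, visited, start_r, start_c, target):
    # Count cells connected to (start_r, start_c) that hold `target`,
    # using 4-way connectivity on a torus.
    rows, cols = grid.shape
    stack = [(start_r, start_c)]  # explicit stack instead of recursion
    visited[start_r, start_c] = True
    size = 0
    while stack:
        r, c = stack.pop()
        size += 1
        for d_r, d_c in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = (r + d_r) % rows, (c + d_c) % cols
            if not visited[nr, nc] and grid[nr, nc] == target:
                visited[nr, nc] = True  # mark on push so no cell is pushed twice
                stack.append((nr, nc))
    return size


grid = np.array([[1, 0, 0, 1],
                 [0, 0, 0, 0],
                 [0, 1, 1, 0]], dtype=np.int32)
visited = np.zeros(grid.shape, dtype=np.bool_)
wrapped = flood_fill_size(grid, visited, 0, 0, 1)  # joins (0, 0) and (0, 3)
inner = flood_fill_size(grid, visited, 2, 1, 1)    # joins (2, 1) and (2, 2)
```

Marking a cell as visited when it is pushed, rather than when it is popped,
is what bounds the stack by the number of grid cells.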

"""

    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)
    stack_ptr = 0

    stack_r[stack_ptr] = start_r
    stack_c[stack_ptr] = start_c
    stack_ptr += 1
    visited[start_r, start_c] = True

    size = 0

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    while stack_ptr > 0:
        stack_ptr -= 1
        r = stack_r[stack_ptr]
        c = stack_c[stack_ptr]
        size += 1

        for k in range(n_neighbors):
            nr = (r + dr[k]) % rows
            nc = (c + dc[k]) % cols

            if not visited[nr, nc] and grid[nr, nc] == target:
                visited[nr, nc] = True
                stack_r[stack_ptr] = nr
                stack_c[stack_ptr] = nc
                stack_ptr += 1

    return size

@njit(cache=True)
def _measure_clusters(grid: np.ndarray, species: int, moore: bool = True) -> np.ndarray:
    """
    Identify and measure the sizes of all connected clusters for a specific species.

    This function scans the entire grid and initiates a flood-fill algorithm
    whenever an unvisited cell of the target species is encountered. It
    returns an array containing the size (cell count) of each identified cluster.

    Parameters
    ----------
    grid : np.ndarray
        2D integer array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    moore : bool, optional
        Determines the connectivity logic. If True, uses the Moore neighborhood
        (8 neighbors); if False, uses the von Neumann neighborhood (4 neighbors).
        Default is True.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers where each element represents the size of
        one connected cluster.

    Notes
    -----
    This function is Numba-optimized and utilizes an internal `visited` mask
    to ensure each cell is processed only once, maintaining $O(N)$
    complexity relative to the number of cells.
    """
    rows, cols = grid.shape
    visited = np.zeros((rows, cols), dtype=np.bool_)

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r, c] == species and not visited[r, c]:
                size = _flood_fill(grid, visited, r, c, species, rows, cols, moore)
                sizes[n_clusters] = size
                n_clusters += 1

    return sizes[:n_clusters]

@njit(cache=True)
def _detect_clusters_numba(
    grid: np.ndarray,
    species: int,
    moore: bool,
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Full cluster detection returning labels and sizes (Numba-accelerated).

    Returns
    -------
    labels : np.ndarray
        2D int32 array where each cell contains its cluster ID (0 = non-target).
    sizes : np.ndarray
        1D int32 array of cluster sizes (index i holds the size of cluster i + 1).
    """
    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=np.int32)

    if moore:
        dr = np.array([-1, -1, -1, 0, 0, 1, 1, 1], dtype=np.int32)
        dc = np.array([-1, 0, 1, -1, 1, -1, 0, 1], dtype=np.int32)
        n_neighbors = 8
    else:
        dr = np.array([-1, 1, 0, 0], dtype=np.int32)
        dc = np.array([0, 0, -1, 1], dtype=np.int32)
        n_neighbors = 4

    max_clusters = rows * cols
    sizes = np.empty(max_clusters, dtype=np.int32)
    n_clusters = 0
    current_label = 1

    max_stack = rows * cols
    stack_r = np.empty(max_stack, dtype=np.int32)
    stack_c = np.empty(max_stack, dtype=np.int32)

    for start_r in range(rows):
        for start_c in range(cols):
            if grid[start_r, start_c] != species or labels[start_r, start_c] != 0:
                continue

            stack_ptr = 0
            stack_r[stack_ptr] = start_r
            stack_c[stack_ptr] = start_c
            stack_ptr += 1
            labels[start_r, start_c] = current_label
            size = 0

            while stack_ptr > 0:
                stack_ptr -= 1
                r = stack_r[stack_ptr]
                c = stack_c[stack_ptr]
                size += 1

                for k in range(n_neighbors):
                    nr = (r + dr[k]) % rows
                    nc = (c + dc[k]) % cols

                    if grid[nr, nc] == species and labels[nr, nc] == 0:
                        labels[nr, nc] = current_label
                        stack_r[stack_ptr] = nr
                        stack_c[stack_ptr] = nc
                        stack_ptr += 1

            sizes[n_clusters] = size
            n_clusters += 1
            current_label += 1

    return labels, sizes[:n_clusters]

# ============================================================================
# PUBLIC API - CLUSTER DETECTION
# ============================================================================


def measure_cluster_sizes_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> np.ndarray:
    """
    Measure cluster sizes for a specific species using Numba-accelerated flood fill.

    This function provides a high-performance interface for calculating cluster
    size statistics without the overhead of generating a full label map. It is
    optimized for large-scale simulation analysis where only distribution
    metrics (e.g., mean size, max size) are required.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity (including diagonals);
        'neumann' uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    cluster_sizes : np.ndarray
        A 1D array of integers, where each element is the cell count of an
        identified cluster.

    Notes
    -----
    The input grid is cast to `int32` to ensure compatibility with the
    underlying JIT-compiled `_measure_clusters` kernel.

    Examples
    --------
    >>> sizes = measure_cluster_sizes_fast(grid, species=1, neighborhood='moore')
    >>> if sizes.size > 0:
    ...     print(f"Largest cluster: {sizes.max()}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    return _measure_clusters(grid_int, np.int32(species), moore)

def detect_clusters_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Tuple[np.ndarray, Dict[int, int]]:
    """
    Perform full cluster detection with labels using Numba acceleration.

    This function returns a label array for spatial analysis and a dictionary
    of cluster sizes. It is significantly faster than standard Python or
    SciPy equivalents for large simulation grids.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    labels : np.ndarray
        A 2D int32 array where each cell contains its unique cluster ID.
        Cells not belonging to the target species are 0.
    sizes : dict
        A dictionary mapping cluster IDs to their respective cell counts.

    Notes
    -----
    The underlying Numba kernel uses a stack-based flood fill to avoid
    recursion limits and handles periodic boundary conditions.

    Examples
    --------
    >>> labels, sizes = detect_clusters_fast(grid, species=1)
    >>> if sizes:
    ...     largest_id = max(sizes, key=sizes.get)
    ...     print(f"Cluster {largest_id} size: {sizes[largest_id]}")
    """
    grid_int = np.asarray(grid, dtype=np.int32)
    moore = neighborhood == "moore"
    labels, sizes_arr = _detect_clusters_numba(grid_int, np.int32(species), moore)
    # Cluster IDs start at 1 because 0 marks non-target cells in `labels`.
    sizes_dict = {i + 1: int(sizes_arr[i]) for i in range(len(sizes_arr))}
    return labels, sizes_dict

def get_cluster_stats_fast(
    grid: np.ndarray,
    species: int,
    neighborhood: str = "moore",
) -> Dict:
    """
    Compute comprehensive cluster statistics for a species using Numba acceleration.

    This function integrates cluster detection and labeling to provide a
    full suite of spatial metrics. It calculates the cluster size distribution
    and the largest cluster fraction, which often serves as an order
    parameter in percolation theory and Phase 1-3 analyses.

    Parameters
    ----------
    grid : np.ndarray
        A 2D array representing the simulation environment.
    species : int
        The target species identifier (e.g., 1 for Prey, 2 for Predator).
    neighborhood : {'moore', 'neumann'}, optional
        The connectivity rule. 'moore' uses 8-way connectivity; 'neumann'
        uses 4-way connectivity. Default is 'moore'.

    Returns
    -------
    stats : dict
        A dictionary containing:

        - 'n_clusters': Total count of isolated clusters.
        - 'sizes': Sorted array (descending) of all cluster sizes.
        - 'largest': Size of the single largest cluster.
        - 'largest_fraction': Size of the largest cluster divided by
          the total population of the species.
        - 'mean_size': Average size of all clusters.
        - 'size_distribution': Frequency mapping of {size: count}.
        - 'labels': 2D array of unique cluster IDs.
        - 'size_dict': Mapping of {label_id: size}.
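
How the derived entries follow from the raw cluster sizes can be sketched as
below. This is an illustration with made-up sizes, not output from a real
grid, and it uses `np.unique` for the frequency table as a stand-in for the
dictionary loop in the function body.

```python
import numpy as np

sizes = np.array([3, 1, 6, 1], dtype=np.int32)  # hypothetical cluster sizes

sizes_sorted = np.sort(sizes)[::-1]   # descending order
largest = int(sizes_sorted[0])
total_pop = int(sizes.sum())          # every species cell lies in one cluster
largest_fraction = largest / total_pop
mean_size = float(sizes.mean())

# Frequency table {size: number of clusters with that size}.
values, counts = np.unique(sizes, return_counts=True)
size_distribution = {int(v): int(n) for v, n in zip(values, counts)}
```

Because each species cell belongs to exactly one cluster, the sum of the
sizes equals the species population, which is why it serves as the
denominator of 'largest_fraction'.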

    Examples
    --------
    >>> stats = get_cluster_stats_fast(grid, species=1)
    >>> print(f"Found {stats['n_clusters']} prey clusters.")
    >>> print(f"Order parameter: {stats['largest_fraction']:.3f}")

"""

    labels, size_dict = detect_clusters_fast(grid, species, neighborhood)

    if len(size_dict) == 0:
        return {
            "n_clusters": 0,
            "sizes": np.array([], dtype=np.int32),
            "largest": 0,
            "largest_fraction": 0.0,
            "mean_size": 0.0,
            "size_distribution": {},
            "labels": labels,
            "size_dict": size_dict,
        }

    sizes = np.array(list(size_dict.values()), dtype=np.int32)
    sizes_sorted = np.sort(sizes)[::-1]
    total_pop = int(np.sum(sizes))
    largest = int(sizes_sorted[0])

    size_dist = {}
    for s in sizes:
        s_int = int(s)
        size_dist[s_int] = size_dist.get(s_int, 0) + 1

    return {
        "n_clusters": len(size_dict),
        "sizes": sizes_sorted,
        "largest": largest,
        "largest_fraction": float(largest) / total_pop if total_pop > 0 else 0.0,
        "mean_size": float(np.mean(sizes)),
        "size_distribution": size_dist,
        "labels": labels,
        "size_dict": size_dict,
    }

# ============================================================================
# PCF COMPUTATION (Cell-list accelerated)
# ============================================================================


@njit(cache=True)
def _build_cell_list(
    positions: np.ndarray,
    n_cells: int,
    L_row: float,
    L_col: float,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, float, float]:
    """
    Build a cell list for spatial hashing to accelerate neighbor lookups.

    This Numba-optimized function partitions a set of coordinates into a
    grid of cells. It uses a three-pass approach to calculate cell occupancy,
    compute starting offsets for each cell in a flat index array, and finally
    populate that array with position indices.

    Parameters
    ----------
    positions : np.ndarray
        An (N, 2) float array of coordinates within the simulation domain.
    n_cells : int
        The number of cells along one dimension of the square grid.
    L_row : float
        The total height (row extent) of the simulation domain.
    L_col : float
        The total width (column extent) of the simulation domain.

    Returns
    -------
    indices : np.ndarray
        A 1D array of original position indices, reordered so that indices
        belonging to the same cell are contiguous.
    offsets : np.ndarray
        A 2D array where `offsets[r, c]` is the starting index in the
        `indices` array for cell (r, c).
    cell_counts : np.ndarray
        A 2D array where `cell_counts[r, c]` is the number of points
        contained in cell (r, c).
    cell_size_r : float
        The calculated height of an individual cell.
    cell_size_c : float
        The calculated width of an individual cell.

    Notes
    -----
    This implementation assumes periodic boundary conditions via the
    modulo operator during coordinate-to-cell mapping. It is designed to
    eliminate heap allocations within the main simulation loop by using
    Numba's efficient array handling.
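
The three passes can be sketched in NumPy as below. This is an assumption
about the general technique rather than a copy of this function's body:
`build_cell_list` is a hypothetical helper, and it omits the two cell-size
return values.

```python
import numpy as np


def build_cell_list(positions, n_cells, L_row, L_col):
    # Group point indices by grid cell using count / offset / fill passes.
    size_r, size_c = L_row / n_cells, L_col / n_cells
    # Map each point to a cell; the modulo wraps points on the upper edge.
    cell_r = (positions[:, 0] // size_r).astype(np.int64) % n_cells
    cell_c = (positions[:, 1] // size_c).astype(np.int64) % n_cells
    flat = cell_r * n_cells + cell_c

    # Pass 1: occupancy of every cell.
    counts = np.zeros(n_cells * n_cells, dtype=np.int64)
    for f in flat:
        counts[f] += 1
    # Pass 2: exclusive prefix sum gives each cell's start in `indices`.
    offsets = np.zeros_like(counts)
    offsets[1:] = np.cumsum(counts)[:-1]
    # Pass 3: drop each point index into the next free slot of its cell.
    cursor = offsets.copy()
    indices = np.empty(len(positions), dtype=np.int64)
    for i, f in enumerate(flat):
        indices[cursor[f]] = i
        cursor[f] += 1
    shape = (n_cells, n_cells)
    return indices, offsets.reshape(shape), counts.reshape(shape)


pts = np.array([[0.5, 0.5], [3.5, 3.5], [0.2, 0.9], [3.9, 0.1]])
indices, offsets, counts = build_cell_list(pts, n_cells=2, L_row=4.0, L_col=4.0)
start = offsets[0, 0]
in_first_cell = sorted(indices[start:start + counts[0, 0]].tolist())
```

With this layout, the points in cell (r, c) are the slice of `indices` that
starts at `offsets[r, c]` and has length `counts[r, c]`.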

"""

    n_pos = len(positions)
    cell_size_r = L_row / n_cells
    cell_size_c = L_col / n_cells
