notes/docs/python/index.xml at master · chenweichang/notes

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>

<title>Pythons on Chris Albon</title>

<description>Recent content in Pythons on Chris Albon</description>

<generator>Hugo -- gohugo.io</generator>

<atom:link href="https://chrisalbon.com/python/index.xml" rel="self" type="application/rss+xml" />

<item>

<title>Add Padding Around String</title>

<guid>https://chrisalbon.com/python/basics/add_padding_around_string/</guid>

<description> Create Some Text text = &#39;Chapter 1&#39; Add Padding Around Text # Add Spaces Of Padding To The Left format(text, &#39;&gt;20&#39;) ' Chapter 1' # Add Spaces Of Padding To The Right format(text, &#39;&lt;20&#39;) 'Chapter 1 ' # Add Spaces Of Padding On Each Side format(text, &#39;^20&#39;) ' Chapter 1 ' # Add * Of Padding On Each Side format(text, &#39;*^20&#39;) '*****Chapter 1******' </description>

</item>

<item>

<title>All Combinations For A List Of Objects</title>

<guid>https://chrisalbon.com/python/basics/all_combinations_of_a_list_of_objects/</guid>

<description>Preliminary # Import combinations with replacements from itertools from itertools import combinations_with_replacement Create a list of objects # Create a list of objects to combine list_of_objects = [&#39;warplanes&#39;, &#39;armor&#39;, &#39;infantry&#39;] Find all combinations (with replacement) for the list # Create an empty list object to hold the results of the loop combinations = [] # Create a loop for every item in the length of list_of_objects, that, for i in list(range(len(list_of_objects))): # Finds every combination (with replacement) for each object in the list combinations.</description>

</item>
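The description above is cut off in the middle of its loop. As a rough sketch of how such a loop might be finished, the block below collects combinations with replacement for every size from 1 up to the length of the list. The size range and the name `all_combinations` are assumptions made for illustration, not text recovered from the original post.

```python
from itertools import combinations_with_replacement

# Same example list as in the item above
list_of_objects = ["warplanes", "armor", "infantry"]

# Collect combinations (with replacement) of every size from 1 to len(list).
# NOTE: this size range is an assumption; the feed text stops before showing it.
all_combinations = []
for size in range(1, len(list_of_objects) + 1):
    all_combinations.extend(combinations_with_replacement(list_of_objects, size))

# Show how many combinations were generated in total
print(len(all_combinations))
```

Each entry is a tuple, so the single-object combinations come out as one-element tuples such as `("warplanes",)`.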

<item>

<title>Apply Functions By Group In Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_apply_function_by_group/</guid>

<description>Preliminaries # import pandas as pd import pandas as pd Create a simulated dataset # Create an example dataframe data = {&#39;Platoon&#39;: [&#39;A&#39;,&#39;A&#39;,&#39;A&#39;,&#39;A&#39;,&#39;A&#39;,&#39;A&#39;,&#39;B&#39;,&#39;B&#39;,&#39;B&#39;,&#39;B&#39;,&#39;B&#39;,&#39;C&#39;,&#39;C&#39;,&#39;C&#39;,&#39;C&#39;,&#39;C&#39;], &#39;Casualties&#39;: [1,4,5,7,5,5,6,1,4,5,6,7,4,6,4,6]} df = pd.DataFrame(data) df Casualties Platoon 0 1 A 1 4 A 2 5 A 3 7 A 4 5 A 5 5 A 6 6 B 7 1 B 8 4 B 9 5 B 10 6 B 11 7 C 12 4 C 13 6 C 14 4 C 15 6 C Apply A Function (Rolling Mean) To The DataFrame, By Group # Group df by df.</description>

</item>

<item>

<title>Apply Operations Over Items In A List</title>

<guid>https://chrisalbon.com/python/basics/apply_operations_over_items_in_lists/</guid>

<description>Method 1: map() # Create a list of casualties from battles battleDeaths = [482, 93, 392, 920, 813, 199, 374, 237, 244]# Create a function that updates all battle deaths by adding 100 def updated(x): return x + 100# Create a list that applies updated() to all elements of battleDeaths list(map(updated, battleDeaths)) [582, 193, 492, 1020, 913, 299, 474, 337, 344] Method 2: for x in y # Create a list of deaths casualties = [482, 93, 392, 920, 813, 199, 374, 237, 244]# Create a variable where we will put the updated casualty numbers casualtiesUpdated = []# Create a function that for each item in casualties, adds 10 for x in casualties: casualtiesUpdated.</description>

</item>

<item>

<title>Apply Operations To Groups In Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_apply_operations_to_groups/</guid>

<description>Preliminaries # import modules import pandas as pd# Create dataframe raw_data = {&#39;regiment&#39;: [&#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;], &#39;company&#39;: [&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;, &#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;,&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;], &#39;name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;, &#39;Jacon&#39;, &#39;Ryaner&#39;, &#39;Sone&#39;, &#39;Sloan&#39;, &#39;Piger&#39;, &#39;Riani&#39;, &#39;Ali&#39;], &#39;preTestScore&#39;: [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]} df = pd.</description>

</item>

<item>

<title>Applying Functions To List Items</title>

<guid>https://chrisalbon.com/python/basics/applying_functions_to_list_items/</guid>

<description>Create a list of regiment names regimentNames = [&#39;Night Riflemen&#39;, &#39;Jungle Scouts&#39;, &#39;The Dragoons&#39;, &#39;Midnight Revengence&#39;, &#39;Wily Warriors&#39;] Using A For Loop Create a for loop that goes through the list and capitalizes each item # create a variable for the for loop results regimentNamesCapitalized_f = [] # for every item in regimentNames for i in regimentNames: # capitalize the item and add it to regimentNamesCapitalized_f regimentNamesCapitalized_f.append(i.upper()) # View the outcome regimentNamesCapitalized_f ['NIGHT RIFLEMEN', 'JUNGLE SCOUTS', 'THE DRAGOONS', 'MIDNIGHT REVENGENCE', 'WILY WARRIORS'] Using Map() Create a lambda function that capitalizes x capitalizer = lambda x: x.</description>

</item>
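The map() part of the description above is truncated. The sketch below shows one way the same upper-casing could be done with `map()` and with a list comprehension. The shortened list and the variable names are hypothetical, and passing `str.upper` directly instead of a lambda is a substitution, not necessarily what the original post does.

```python
# A shortened, hypothetical version of the regiment list
regiment_names = ["Night Riflemen", "Jungle Scouts", "The Dragoons"]

# map() applies the function to every item; list() materializes the result
upper_with_map = list(map(str.upper, regiment_names))

# The equivalent list comprehension
upper_with_comprehension = [name.upper() for name in regiment_names]

print(upper_with_map)
```

Both forms produce the same list; the comprehension is often easier to read when the operation is a simple method call.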

<item>

<title>Applying Operations Over pandas Dataframes</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_apply_operations_to_dataframes/</guid>

<description>Import Modules import pandas as pd import numpy as np Create a dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;year&#39;: [2012, 2012, 2013, 2014, 2014], &#39;reports&#39;: [4, 24, 31, 2, 3], &#39;coverage&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(data, index = [&#39;Cochice&#39;, &#39;Pima&#39;, &#39;Santa Cruz&#39;, &#39;Maricopa&#39;, &#39;Yuma&#39;]) df .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } coverage name reports year Cochice 25 Jason 4 2012 Pima 94 Molly 24 2012 Santa Cruz 57 Tina 31 2013 Maricopa 62 Jake 2 2014 Yuma 70 Amy 3 2014 Create a capitalization lambda function capitalizer = lambda x: x.</description>

</item>

<item>

<title>Arithmetic Basics</title>

<guid>https://chrisalbon.com/python/basics/arithmetic_basics/</guid>

<description>Create some simulated variables x = 6 y = 9 x plus y x + y 15 x minus y x - y -3 x times y x * y 54 the remainder of x divided by y x % y 6 x divided by y x / y 0.6666666666666666 x divided by y (floor) (i.e. the quotient) x // y 0 x raised to the y power x ** y 10077696 x plus y, then divide by x (x + y) / x 2.</description>

</item>

<item>

<title>Assign A New Column To A Pandas DataFrame</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_assign_new_column_dataframe/</guid>

<description> Preliminaries import pandas as pd Create Dataframe # Create empty dataframe df = pd.DataFrame() # Create a column df[&#39;name&#39;] = [&#39;John&#39;, &#39;Steve&#39;, &#39;Sarah&#39;] # View dataframe df name 0 John 1 Steve 2 Sarah Assign New Column To Dataframe # Assign a new column to df called &#39;age&#39; with a list of ages df.assign(age = [31, 32, 19]) name age 0 John 31 1 Steve 32 2 Sarah 19 </description>

</item>

<item>

<title>Assignment Operators</title>

<guid>https://chrisalbon.com/python/basics/assignment_operators/</guid>

<description>Create some variables a = 2 b = 1 c = 0 d = 3 Assign the value of the right side to the left side c = a + b c 3 Add right to the left and assign the result to left (c = c + a) c += a c 5 Subtract right from the left and assign the result to left (c = c - a) c -= a c 3 Multiply left by the right and assign the result to left (c = c * a) c *= a c 6 Divide left by the right and assign the result to left (c = c / a) c /= a c 3.</description>

</item>

<item>

<title>Back To Back Bar Plot In MatPlotLib</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_back_to_back_bar_plot/</guid>

<description>Preliminaries %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;pre_score&#39;: [4, 24, 31, 2, 3], &#39;mid_score&#39;: [25, 94, 57, 62, 70], &#39;post_score&#39;: [5, 43, 23, 23, 51]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;pre_score&#39;, &#39;mid_score&#39;, &#39;post_score&#39;]) df first_name pre_score mid_score post_score 0 Jason 4 25 5 1 Molly 24 94 43 2 Tina 31 57 23 3 Jake 2 62 23 4 Amy 3 70 51 Make plot # input data, specifically the second and # third rows, skipping the first column x1 = df.</description>

</item>

<item>

<title>Bar Plot In MatPlotLib</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_bar_plot/</guid>

<description>Preliminaries %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;pre_score&#39;: [4, 24, 31, 2, 3], &#39;mid_score&#39;: [25, 94, 57, 62, 70], &#39;post_score&#39;: [5, 43, 23, 23, 51]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;pre_score&#39;, &#39;mid_score&#39;, &#39;post_score&#39;]) df first_name pre_score mid_score post_score 0 Jason 4 25 5 1 Molly 24 94 43 2 Tina 31 57 23 3 Jake 2 62 23 4 Amy 3 70 51 Make plot # Create a list of the mean scores for each variable mean_values = [df[&#39;pre_score&#39;].</description>

</item>

<item>

<title>Basic Operations With NumPy Array</title>

<guid>https://chrisalbon.com/python/basics/numpy_array_basic_operations/</guid>

<description># Import modules import numpy as np# Create an array civilian_deaths = np.array([4352, 233, 3245, 256, 2394]) civilian_deaths array([4352, 233, 3245, 256, 2394]) # Mean value of the array civilian_deaths.mean() 2096.0 # Total amount of deaths civilian_deaths.sum() 10480 # Smallest value in the array civilian_deaths.min() 233 # Largest value in the array civilian_deaths.max() 4352 </description>

</item>

<item>

<title>Beautiful Soup Basic HTML Scraping</title>

<guid>https://chrisalbon.com/python/web_scraping/beautiful_soup_html_basics/</guid>

<description>Import the modules # Import required modules import requests from bs4 import BeautifulSoup Scrape the HTML and turn it into a Beautiful Soup object # Create a variable with the url url = &#39;http://chrisralbon.com&#39; # Use requests to get the contents r = requests.get(url) # Get the text of the contents html_content = r.text # Convert the html content into a beautiful soup object soup = BeautifulSoup(html_content, &#39;lxml&#39;) Select the website&rsquo;s title # View the title tag of the soup object soup.</description>

</item>

<item>

<title>Break A List Into N-Sized Chunks</title>

<guid>https://chrisalbon.com/python/data_wrangling/break_list_into_chunks_of_equal_size/</guid>

<description>In this snippet we take a list and break it up into n-size chunks. This is a very common practice when dealing with APIs that have a maximum request size.

Credit for this nifty function goes to Ned Batchelder who posted it on StackOverflow.

# Create a list of first names first_names = [&#39;Steve&#39;, &#39;Jane&#39;, &#39;Sara&#39;, &#39;Mary&#39;,&#39;Jack&#39;,&#39;Bob&#39;, &#39;Bily&#39;, &#39;Boni&#39;, &#39;Chris&#39;,&#39;Sori&#39;, &#39;Will&#39;, &#39;Won&#39;,&#39;Li&#39;] # Create a function called &#34;chunks&#34; with two arguments, l and n: def chunks(l, n): # For each start index i from 0 to len(l), in steps of n, for i in range(0, len(l), n): # Yield the slice of l holding the next n items: yield l[i:i+n] # Create a list from the results of the function chunks: list(chunks(first_names, 5)) [['Steve', 'Jane', 'Sara', 'Mary', 'Jack'], ['Bob', 'Bily', 'Boni', 'Chris', 'Sori'], ['Will', 'Won', 'Li']] </description>

</item>
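The slicing approach in the item above needs a sequence that supports `len()` and indexing. When the input is a generator or another one-pass iterable, a variant built on `itertools.islice` may be useful. The helper name `chunked` is hypothetical and this is only a sketch of the idea, not the function credited in the post.

```python
from itertools import islice


def chunked(iterable, n):
    """Yield successive lists of at most n items from any iterable."""
    iterator = iter(iterable)
    while True:
        # Pull the next n items (fewer if the iterator is nearly exhausted)
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


# Works on a range, a generator, or a list alike
batches = list(chunked(range(7), 3))
print(batches)
```

On Python 3.12 and later, `itertools.batched` covers the same need, although it yields tuples instead of lists.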

<item>

<title>Breaking Up A String Into Columns Using Regex In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_regex_to_create_columns/</guid>

<description>Import modules import re import pandas as pd Create a dataframe of raw strings # Create a dataframe with a single column of strings data = {&#39;raw&#39;: [&#39;Arizona 1 2014-12-23 3242.0&#39;, &#39;Iowa 1 2010-02-23 3453.7&#39;, &#39;Oregon 0 2014-06-20 2123.0&#39;, &#39;Maryland 0 2014-03-14 1123.6&#39;, &#39;Florida 1 2013-01-15 2134.0&#39;, &#39;Georgia 0 2012-07-14 2345.6&#39;]} df = pd.DataFrame(data, columns = [&#39;raw&#39;]) df raw 0 Arizona 1 2014-12-23 3242.</description>

</item>

<item>

<title>Breaking Up String Variables</title>

<guid>https://chrisalbon.com/python/basics/breaking_up_string_variables/</guid>

<description> Basic name assignment variableName = &#39;This is a string.&#39; List assignment One, Two, Three = [1, 2, 3] Break up a string into variables firstLetter, secondLetter, thirdLetter, fourthLetter = &#39;Bark&#39; firstLetter 'B' secondLetter 'a' thirdLetter 'r' fourthLetter 'k' Breaking up a number into separate variables firstNumber, secondNumber, thirdNumber, fourthNumber = &#39;9485&#39; firstNumber '9' secondNumber '4' thirdNumber '8' fourthNumber '5' Assign the first letter of &lsquo;spam&rsquo; to variable a, and assign all the remaining letters to variable b a, *b = &#39;spam&#39; a 's' b ['p', 'a', 'm'] </description>

</item>

<item>

<title>Brute Force D20 Roll Simulator</title>

<guid>https://chrisalbon.com/python/basics/brute_force_d20_simulator/</guid>

<description>This snippet is a completely inefficient simulator of a 20-sided die. To create a &ldquo;successful roll&rdquo; the snippet has to generate dozens of random numbers.

Import random module import random Create a variable with a TRUE value rolling = True Create a while loop that rolls until the first digit is 2 or less and the second digit is 10 or less # while rolling is true while rolling: # create x, a random number between 0 and 99 x = random.</description>

</item>

<item>

<title>Cartesian Product</title>

<guid>https://chrisalbon.com/python/basics/cartesian_product/</guid>

<description>Preliminaries # import pandas as pd import pandas as pd Create Data # Create two lists i = [1,2,3,4,5] j = [1,2,3,4,5] Calculate Cartesian Product (Method 1) # List every single x in i with every single y (i.e. Cartesian product) [(x, y) for x in i for y in j] [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)] Calculate Cartesian Product (Method 2) # An alternative way to do the Cartesian product # import itertools import itertools # for two sets, find the Cartesian product for i in itertools.</description>

</item>
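The second method in the item above is cut off after `itertools.`. As a hedged illustration of how a Cartesian product is commonly computed with the standard library, the sketch below uses `itertools.product` on two small lists. The list contents and the names `left`, `right`, and `pairs` are made up for this example.

```python
from itertools import product

# Two small, hypothetical input lists
left = [1, 2, 3]
right = ["a", "b"]

# product() pairs every item of the first list with every item of the second
pairs = list(product(left, right))

# The nested comprehension from Method 1 gives the same result
pairs_by_comprehension = [(x, y) for x in left for y in right]

print(pairs)
```

Using distinct names for the loop variable and the input lists avoids accidentally rebinding an input while iterating.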

<item>

<title>Chain Together Lists</title>

<guid>https://chrisalbon.com/python/basics/chain_together_lists/</guid>

<description> Preliminaries from itertools import chain Create Two Lists # Create a list of allies allies = [&#39;Spain&#39;, &#39;Germany&#39;, &#39;Namibia&#39;, &#39;Austria&#39;] # Create a list of enemies enemies = [&#39;Mexico&#39;, &#39;United Kingdom&#39;, &#39;France&#39;] Iterate Over Both Lists As A Single Sequence # For each country in allies and enemies for country in chain(allies, enemies): # print the country print(country) Spain Germany Namibia Austria Mexico United Kingdom France </description>

</item>

<item>

<title>Cleaning Text</title>

<guid>https://chrisalbon.com/python/basics/cleaning_text/</guid>

<description>Create some raw text # Create a list of three strings. incoming_reports = [&#34;We are attacking on their left flank but are losing many men.&#34;, &#34;We cannot see the enemy army. Nothing else to report.&#34;, &#34;We are ready to attack but are waiting for your orders.&#34;] Separate by word # import word tokenizer from nltk.tokenize import word_tokenize # Apply word_tokenize to each element of the list called incoming_reports tokenized_reports = [word_tokenize(report) for report in incoming_reports] # View tokenized_reports tokenized_reports [['We', 'are', 'attacking', 'on', 'their', 'left', 'flank', 'but', 'are', 'losing', 'many', 'men', '.</description>

</item>

<item>

<title>Color Palettes in Seaborn</title>

<guid>https://chrisalbon.com/python/data_visualization/seaborn_color_palettes/</guid>

<description>Preliminaries import pandas as pd %matplotlib inline import matplotlib.pyplot as plt import seaborn as sns data = {&#39;date&#39;: [&#39;2014-05-01 18:47:05.069722&#39;, &#39;2014-05-01 18:47:05.119994&#39;, &#39;2014-05-02 18:47:05.178768&#39;, &#39;2014-05-02 18:47:05.230071&#39;, &#39;2014-05-02 18:47:05.230071&#39;, &#39;2014-05-02 18:47:05.280592&#39;, &#39;2014-05-03 18:47:05.332662&#39;, &#39;2014-05-03 18:47:05.385109&#39;, &#39;2014-05-04 18:47:05.436523&#39;, &#39;2014-05-04 18:47:05.486877&#39;], &#39;deaths_regiment_1&#39;: [34, 43, 14, 15, 15, 14, 31, 25, 62, 41], &#39;deaths_regiment_2&#39;: [52, 66, 78, 15, 15, 5, 25, 25, 86, 1], &#39;deaths_regiment_3&#39;: [13, 73, 82, 58, 52, 87, 26, 5, 56, 75], &#39;deaths_regiment_4&#39;: [44, 75, 26, 15, 15, 14, 54, 25, 24, 72], &#39;deaths_regiment_5&#39;: [25, 24, 25, 15, 57, 68, 21, 27, 62, 5], &#39;deaths_regiment_6&#39;: [84, 84, 26, 15, 15, 14, 26, 25, 62, 24], &#39;deaths_regiment_7&#39;: [46, 57, 26, 15, 15, 14, 26, 25, 62, 41]} df = pd.</description>

</item>

<item>

<title>Compare Two Dictionaries</title>

<guid>https://chrisalbon.com/python/basics/compare_two_dictionaries/</guid>

<description>Make Two Dictionaries importers = {&#39;El Salvador&#39; : 1234, &#39;Nicaragua&#39; : 152, &#39;Spain&#39; : 252 } exporters = {&#39;Spain&#39; : 252, &#39;Germany&#39; : 251, &#39;Italy&#39; : 1563 } Find Duplicate Keys # Find the intersection of importers and exporters importers.keys() &amp; exporters.keys() {'Spain'} Find Difference In Keys # Find the difference between importers and exporters importers.keys() - exporters.keys() {'El Salvador', 'Nicaragua'} Find Key, Values Pairs In Common # Find countries where the amount of exports matches the amount of imports importers.</description>

</item>
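The last step in the item above, finding key and value pairs common to both dictionaries, is truncated. One way to express it is a set intersection over the dictionaries' items, sketched below with the same example data. The name `shared_pairs` is an assumption, and the original post may use a different expression.

```python
# Same example dictionaries as in the item above
importers = {"El Salvador": 1234, "Nicaragua": 152, "Spain": 252}
exporters = {"Spain": 252, "Germany": 251, "Italy": 1563}

# A pair is shared only when both the key and the value match
shared_pairs = set(importers.items()).intersection(exporters.items())

print(shared_pairs)
```

This works because the values here are hashable; dictionaries holding lists or other unhashable values would need a loop instead.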

<item>

<title>Concurrent Processing</title>

<guid>https://chrisalbon.com/python/basics/concurrent_processing/</guid>

<description> Preliminaries from concurrent import futures Create Data data = range(100) Create Function # Create some function that takes a value def some_function(value): # And outputs it raised to its own power return value**value Run The Function On The Data Concurrently # With a pool of workers with futures.ProcessPoolExecutor() as executor: # Map the function to the data result = executor.map(some_function, data) View Results # List the first 5 outputs list(result)[0:5] [1, 1, 4, 27, 256] </description>

</item>

<item>

<title>Construct A Dictionary From Multiple Lists</title>

<guid>https://chrisalbon.com/python/data_wrangling/construct_a_dictionary_from_multiple_lists/</guid>

<description> Create Two Lists # Create a list of the officers&#39; names officer_names = [&#39;Sodoni Dogla&#39;, &#39;Chris Jefferson&#39;, &#39;Jessica Billars&#39;, &#39;Michael Mulligan&#39;, &#39;Steven Johnson&#39;] # Create a list of the officers&#39; armies officer_armies = [&#39;Purple Army&#39;, &#39;Orange Army&#39;, &#39;Green Army&#39;, &#39;Red Army&#39;, &#39;Blue Army&#39;] Construct A Dictionary From The Two Lists # Create a dictionary that is the zip of the two lists dict(zip(officer_names, officer_armies)) {'Chris Jefferson': 'Orange Army', 'Jessica Billars': 'Green Army', 'Michael Mulligan': 'Red Army', 'Sodoni Dogla': 'Purple Army', 'Steven Johnson': 'Blue Army'} </description>

</item>

<item>

<title>Continue And Break Loops</title>

<guid>https://chrisalbon.com/python/basics/continue_and_break_loops/</guid>

<description>Import the random module import random Create a while loop # set running to true running = True # while running is true while running: # Create a random integer between 0 and 5 s = random.randint(0,5) # If the integer is less than 3 if s &lt; 3: # Print this print(&#39;It is too small, starting over.&#39;) # Skip to the next iteration of the loop # (i.e. skip everything below and restart from the top) continue # If the integer is 4 if s == 4: running = False # Print this print(&#39;It is 4!</description>

</item>

<item>

<title>Convert A CSV Into Python Code To Recreate It</title>

<guid>https://chrisalbon.com/python/data_wrangling/csv_to_python_code/</guid>

<description>Preliminaries # Import the pandas package import pandas as pd Load the external dataset # Load the csv file as a pandas dataframe df_original = pd.read_csv(&#39;http://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv&#39;) df = pd.read_csv(&#39;http://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv&#39;) Print the code required to create that dataset # Print the code to create the dataframe print(&#39;==============================&#39;) print(&#39;RUN THE CODE BELOW THIS LINE&#39;) print(&#39;==============================&#39;) print(&#39;raw_data =&#39;, df.to_dict(orient=&#39;list&#39;)) print(&#39;df = pd.DataFrame(raw_data, columns = &#39; + str(list(df_original)) + &#39;)&#39;) ============================== RUN THE CODE BELOW THIS LINE ============================== raw_data = {'Sepal.</description>

</item>

<item>

<title>Convert A Categorical Variable Into Dummy Variables</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_convert_categorical_to_dummies/</guid>

<description># import modules import pandas as pd# Create a dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;sex&#39;: [&#39;male&#39;, &#39;female&#39;, &#39;male&#39;, &#39;female&#39;, &#39;female&#39;]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;sex&#39;]) df first_name last_name sex 0 Jason Miller male 1 Molly Jacobson female 2 Tina Ali male 3 Jake Milner female 4 Amy Cooze female # Create a set of dummy variables from the sex variable df_sex = pd.</description>

</item>

<item>

<title>Convert A Categorical Variable Into Dummy Variables</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_convert_numeric_categorical_to_numeric_with_patsy/</guid>

<description># import modules import pandas as pd import patsy# Create dataframe raw_data = {&#39;countrycode&#39;: [1, 2, 3, 2, 1]} df = pd.DataFrame(raw_data, columns = [&#39;countrycode&#39;]) df countrycode 0 1 1 2 2 3 3 2 4 1 # Convert the countrycode variable into three binary variables patsy.dmatrix(&#39;C(countrycode)-1&#39;, df, return_type=&#39;dataframe&#39;) C(countrycode)[1] C(countrycode)[2] C(countrycode)[3] 0 1.</description>

</item>

<item>

<title>Convert A String Categorical Variable To A Numeric Variable</title>

<guid>https://chrisalbon.com/python/data_wrangling/convert_categorical_to_numeric/</guid>

<description>import modules import pandas as pd Create dataframe raw_data = {&#39;patient&#39;: [1, 1, 1, 2, 2], &#39;obs&#39;: [1, 2, 3, 1, 2], &#39;treatment&#39;: [0, 1, 0, 1, 0], &#39;score&#39;: [&#39;strong&#39;, &#39;weak&#39;, &#39;normal&#39;, &#39;weak&#39;, &#39;strong&#39;]} df = pd.DataFrame(raw_data, columns = [&#39;patient&#39;, &#39;obs&#39;, &#39;treatment&#39;, &#39;score&#39;]) df patient obs treatment score 0 1 1 0 strong 1 1 2 1 weak 2 1 3 0 normal 3 2 1 1 weak 4 2 2 0 strong Create a function that converts all values of df['score'] into numbers def score_to_numeric(x): if x==&#39;strong&#39;: return 3 if x==&#39;normal&#39;: return 2 if x==&#39;weak&#39;: return 1 Apply the function to the score variable df[&#39;score_num&#39;] = df[&#39;score&#39;].</description>

</item>

<item>

<title>Convert A Variable To A Time Variable In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_convert_to_datetime/</guid>

<description># Import Preliminaries import pandas as pd # Create a dataset with a date column stored as strings raw_data = {&#39;date&#39;: [&#39;2014-06-01T01:21:38.004053&#39;, &#39;2014-06-02T01:21:38.004053&#39;, &#39;2014-06-03T01:21:38.004053&#39;], &#39;score&#39;: [25, 94, 57]} df = pd.DataFrame(raw_data, columns = [&#39;date&#39;, &#39;score&#39;]) df date score 0 2014-06-01T01:21:38.004053 25 1 2014-06-02T01:21:38.004053 94 2 2014-06-03T01:21:38.004053 57 # Convert the date column from strings to datetime values df[&#34;date&#34;] = pd.</description>

</item>

<item>

<title>Convert HTML Characters To Strings</title>

<guid>https://chrisalbon.com/python/basics/convert_html_symbols_to_strings/</guid>

<description>Preliminaries import html Create Text text = &#39;This item costs &amp;#165;400 or &amp;#163;4.&#39; Convert To String html.unescape(text) 'This item costs ¥400 or £4.' Convert To HTML Entities html.escape(text) 'This item costs &amp;amp;#165;400 or &amp;amp;#163;4.' </description>

</item>

<item>

<title>Converting Strings To Datetime</title>

<guid>https://chrisalbon.com/python/basics/strings_to_datetime/</guid>

<description>Import modules from datetime import datetime from dateutil.parser import parse import pandas as pd Create a string variable with the war start time war_start = &#39;2011-01-03&#39; Convert the string to datetime format datetime.strptime(war_start, &#39;%Y-%m-%d&#39;) datetime.datetime(2011, 1, 3, 0, 0) Create a list of strings as dates attack_dates = [&#39;7/2/2011&#39;, &#39;8/6/2012&#39;, &#39;11/13/2013&#39;, &#39;5/26/2011&#39;, &#39;5/2/2001&#39;] Convert attack_dates strings into datetime format [datetime.strptime(x, &#39;%m/%d/%Y&#39;) for x in attack_dates] [datetime.datetime(2011, 7, 2, 0, 0), datetime.</description>

</item>
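The output in the item above is cut off partway through the parsed list. The sketch below parses a few of the same month/day/year strings with `datetime.strptime` and converts them to ISO dates so the result is easy to check. The shortened list and the names `parsed` and `iso_dates` are chosen here for illustration.

```python
from datetime import datetime

# A few of the date strings from the item above (month/day/year)
attack_dates = ["7/2/2011", "8/6/2012", "11/13/2013"]

# %m and %d accept values with or without a leading zero when parsing
parsed = [datetime.strptime(value, "%m/%d/%Y") for value in attack_dates]

# Convert to ISO 8601 date strings for a compact view
iso_dates = [moment.date().isoformat() for moment in parsed]

print(iso_dates)
```

Spelling out the format string avoids the month/day ambiguity that a guessing parser has to resolve for values like 7/2/2011.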

<item>

<title>Count Values In Pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_dataframe_count_values/</guid>

<description>Import the pandas module import pandas as pd Create all the columns of the dataframe as series year = pd.Series([1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890, 1891, 1892, 1893, 1894]) guardCorps = pd.Series([0,2,2,1,0,0,1,1,0,3,0,2,1,0,0,1,0,1,0,1]) corps1 = pd.Series([0,0,0,2,0,3,0,2,0,0,0,1,1,1,0,2,0,3,1,0]) corps2 = pd.Series([0,0,0,2,0,2,0,0,1,1,0,0,2,1,1,0,0,2,0,0]) corps3 = pd.Series([0,0,0,1,1,1,2,0,2,0,0,0,1,0,1,2,1,0,0,0]) corps4 = pd.Series([0,1,0,1,1,1,1,0,0,0,0,1,0,0,0,0,1,1,0,0]) corps5 = pd.Series([0,0,0,0,2,1,0,0,1,0,0,1,0,1,1,1,1,1,1,0]) corps6 = pd.Series([0,0,1,0,2,0,0,1,2,0,1,1,3,1,1,1,0,3,0,0]) corps7 = pd.Series([1,0,1,0,0,0,1,0,1,1,0,0,2,0,0,2,1,0,2,0]) corps8 = pd.Series([1,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,1,0,1]) corps9 = pd.Series([0,0,0,0,0,2,1,1,1,0,2,1,1,0,1,2,0,1,0,0]) corps10 = pd.</description>

</item>

<item>

<title>Create A New File Then Write To It</title>

<guid>https://chrisalbon.com/python/basics/create_a_new_file_and_the_write_to_it/</guid>

<description>Create A New File And Write To It # Create a file if it doesn&#39;t already exist with open(&#39;file.txt&#39;, &#39;xt&#39;) as f: # Write to the file f.write(&#39;This file now exists!&#39;) # Close the connection to the file f.close() Open The File And Read It # Open the file with open(&#39;file.txt&#39;, &#39;rt&#39;) as f: # Read the data in the file data = f.read() # Close the connection to the file f.</description>

</item>

<item>

<title>Create A Pipeline In Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_create_pipeline/</guid>

<description>Pandas&rsquo; pipeline feature allows you to string together Python functions in order to build a pipeline of data processing.
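
As a minimal sketch of the idea, assuming the same name, gender, and age columns used in this tutorial: DataFrame.pipe passes the dataframe as the first argument to each function in turn, so the steps read top to bottom. The helper names keep_adults and mean_age_by_group and the min_age parameter are illustrative choices, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Steve", "Sarah"],
    "gender": ["Male", "Male", "Female"],
    "age": [31, 32, 19],
})

def keep_adults(dataframe, min_age):
    # Keep only the rows whose age is at least min_age (hypothetical helper)
    return dataframe[dataframe["age"].ge(min_age)]

def mean_age_by_group(dataframe, col):
    # Group by one column and return the mean age per group
    return dataframe.groupby(col)["age"].mean()

# Each pipe() call hands the previous result to the next function
result = df.pipe(keep_adults, min_age=18).pipe(mean_age_by_group, col="gender")
print(result)
```

Selecting the age column before calling mean() keeps the aggregation away from the non-numeric name column.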

Preliminaries import pandas as pd Create Dataframe # Create empty dataframe df = pd.DataFrame() # Create a column df[&#39;name&#39;] = [&#39;John&#39;, &#39;Steve&#39;, &#39;Sarah&#39;] df[&#39;gender&#39;] = [&#39;Male&#39;, &#39;Male&#39;, &#39;Female&#39;] df[&#39;age&#39;] = [31, 32, 19] # View dataframe df name gender age 0 John Male 31 1 Steve Male 32 2 Sarah Female 19 Create Functions To Process Data # Create a function that def mean_age_by_group(dataframe, col): # groups the data by a column and returns the mean age per group return dataframe.</description>

</item>

<item>

<title>Create A Temporary File</title>

<guid>https://chrisalbon.com/python/basics/create_a_temporary_file/</guid>

<description>Preliminaries from tempfile import NamedTemporaryFile Create A Temporary File f = NamedTemporaryFile(&#39;w+t&#39;) Write To The Temp File # Write to the file, the output is the number of characters f.write(&#39;Nobody lived on Deadweather but us and the pirates. It wasn’t hard to understand why.&#39;) 85 View The Tmp File&rsquo;s Name f.name '/var/folders/0b/pj3wsd750fjf8xzfb0n127w80000gn/T/tmphv1dkovx' Read The File # Go to the top of the file f.seek(0) # Read the file f.</description>

</item>

<item>

<title>Create A pandas Column With A For Loop</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_create_column_with_loop/</guid>

<description>Preliminaries import pandas as pd import numpy as np Create an example dataframe raw_data = {&#39;student_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;, &#39;Jacon&#39;, &#39;Ryaner&#39;, &#39;Sone&#39;, &#39;Sloan&#39;, &#39;Piger&#39;, &#39;Riani&#39;, &#39;Ali&#39;], &#39;test_score&#39;: [76, 88, 84, 67, 53, 96, 64, 91, 77, 73, 52, np.nan]} df = pd.DataFrame(raw_data, columns = [&#39;student_name&#39;, &#39;test_score&#39;]) Create a function to assign letter grades # Create a list to store the data grades = [] # For each row in the column, for row in df[&#39;test_score&#39;]: # if more than a value, if row &gt; 95: # Append a letter grade grades.</description>

</item>

<item>

<title>Create Counts Of Items</title>

<guid>https://chrisalbon.com/python/data_wrangling/creating_counts_of_items/</guid>

<description>Preliminaries from collections import Counter Create A Counter # Create a counter of the fruits eaten today fruit_eaten = Counter([&#39;Apple&#39;, &#39;Apple&#39;, &#39;Apple&#39;, &#39;Banana&#39;, &#39;Pear&#39;, &#39;Pineapple&#39;]) # View counter fruit_eaten Counter({'Apple': 3, 'Banana': 1, 'Pear': 1, 'Pineapple': 1}) Update The Count For An Element # Update the count for &#39;Pineapple&#39; (because you just ate a pineapple) fruit_eaten.update([&#39;Pineapple&#39;]) # View the counter fruit_eaten Counter({'Apple': 3, 'Banana': 1, 'Pear': 1, 'Pineapple': 2}) View The Items With The Highest Counts # View the items with the top 3 counts fruit_eaten.</description>

</item>

<item>

<title>Create a Column Based on a Conditional in pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_create_column_using_conditional/</guid>

<description>Preliminaries # Import required modules import pandas as pd import numpy as np Make a dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;preTestScore&#39;: [4, 24, 31, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(data, columns = [&#39;name&#39;, &#39;age&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df name age preTestScore postTestScore 0 Jason 42 4 25 1 Molly 52 24 94 2 Tina 36 31 57 3 Jake 24 2 62 4 Amy 73 3 70 Add a new column for elderly # Create a new column called df.</description>

</item>

<item>

<title>Creating A Time Series Plot With Seaborn And pandas</title>

<guid>https://chrisalbon.com/python/data_visualization/seaborn_pandas_timeseries_plot/</guid>

</item>

<item>

<title>Creating Lists From Dictionary Keys And Values</title>

<guid>https://chrisalbon.com/python/data_wrangling/create_list_from_dictionary_keys_and_values/</guid>

<description> Create a dictionary dict = {&#39;county&#39;: [&#39;Cochice&#39;, &#39;Pima&#39;, &#39;Santa Cruz&#39;, &#39;Maricopa&#39;, &#39;Yuma&#39;], &#39;year&#39;: [2012, 2012, 2013, 2014, 2014], &#39;fireReports&#39;: [4, 24, 31, 2, 3]} Create a list from the dictionary keys # Create a list of keys list(dict.keys()) ['fireReports', 'year', 'county'] Create a list from the dictionary values # Create a list of values list(dict.values()) [[4, 24, 31, 2, 3], [2012, 2012, 2013, 2014, 2014], ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma']] </description>

</item>

<item>

<title>Creating Scatterplots With Seaborn</title>

<guid>https://chrisalbon.com/python/data_visualization/seaborn_scatterplot/</guid>

<description>Preliminaries import pandas as pd %matplotlib inline import random import matplotlib.pyplot as plt import seaborn as sns Create data # Create empty dataframe df = pd.DataFrame() # Add columns df[&#39;x&#39;] = random.sample(range(1, 1000), 5) df[&#39;y&#39;] = random.sample(range(1, 1000), 5) df[&#39;z&#39;] = [1,0,0,1,0] df[&#39;k&#39;] = [&#39;male&#39;,&#39;male&#39;,&#39;male&#39;,&#39;female&#39;,&#39;female&#39;] # View first few rows of data df.head()</description>

</item>

<item>

<title>Crosstabs In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_crosstabs/</guid>

<description>Import pandas import pandas as pd raw_data = {&#39;regiment&#39;: [&#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;], &#39;company&#39;: [&#39;infantry&#39;, &#39;infantry&#39;, &#39;cavalry&#39;, &#39;cavalry&#39;, &#39;infantry&#39;, &#39;infantry&#39;, &#39;cavalry&#39;, &#39;cavalry&#39;,&#39;infantry&#39;, &#39;infantry&#39;, &#39;cavalry&#39;, &#39;cavalry&#39;], &#39;experience&#39;: [&#39;veteran&#39;, &#39;rookie&#39;, &#39;veteran&#39;, &#39;rookie&#39;, &#39;veteran&#39;, &#39;rookie&#39;, &#39;veteran&#39;, &#39;rookie&#39;,&#39;veteran&#39;, &#39;rookie&#39;, &#39;veteran&#39;, &#39;rookie&#39;], &#39;name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;, &#39;Jacon&#39;, &#39;Ryaner&#39;, &#39;Sone&#39;, &#39;Sloan&#39;, &#39;Piger&#39;, &#39;Riani&#39;, &#39;Ali&#39;], &#39;preTestScore&#39;: [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]} df = pd.</description>

</item>

<item>

<title>Data Structure Basics</title>

<guid>https://chrisalbon.com/python/basics/data_structure_basics/</guid>

<description>Lists &ldquo;A list is a data structure that holds an ordered collection of items i.e. you can store a sequence of items in a list.&rdquo; - A Byte Of Python

Lists are mutable.
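
A short sketch of what mutability means in practice: a list can be changed in place, and every name bound to that list sees the change, while a copy stays separate. The variable names here are illustrative.

```python
allies = ["USA", "UK", "France"]
same_list = allies             # a second name for the same list object

allies.append("Canada")        # add an item in place
allies[0] = "United States"    # replace an item in place

independent = list(allies)     # a shallow copy is a separate list
independent.remove("UK")       # changing the copy leaves the original alone

print(same_list)
print(independent)
```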

# Create a list of countries, then print the results allies = [&#39;USA&#39;,&#39;UK&#39;,&#39;France&#39;,&#39;New Zealand&#39;, &#39;Australia&#39;,&#39;Canada&#39;,&#39;Poland&#39;]; allies ['USA', 'UK', 'France', 'New Zealand', 'Australia', 'Canada', 'Poland'] # Print the length of the list len(allies) 7 # Add an item to the list, then print the results allies.</description>

</item>

<item>

<title>Date And Time Basics</title>

<guid>https://chrisalbon.com/python/basics/date_and_time_basics/</guid>

<description># Import modules from datetime import datetime from datetime import timedelta # Create a variable with the current time now = datetime.now() now datetime.datetime(2014, 5, 11, 20, 5, 11, 688051) # The current year now.year 2014 # The current month now.month 5 # The current day now.day 11 # The current hour now.hour 20 # The current minute now.minute 5 # The difference between two dates delta = datetime(2011, 1, 7) - datetime(2011, 1, 6) delta datetime.</description>

</item>

<item>

<title>Delete Duplicates In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_delete_duplicates/</guid>

<description>import modules import pandas as pd Create dataframe with duplicates raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Jason&#39;, &#39;Jason&#39;,&#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Miller&#39;, &#39;Miller&#39;,&#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;age&#39;: [42, 42, 1111111, 36, 24, 73], &#39;preTestScore&#39;: [4, 4, 4, 31, 2, 3], &#39;postTestScore&#39;: [25, 25, 25, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df</description>

</item>

<item>

<title>Descriptive Statistics For pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_dataframe_descriptive_stats/</guid>

<description>Import modules import pandas as pd Create dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;preTestScore&#39;: [4, 24, 31, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(data, columns = [&#39;name&#39;, &#39;age&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df name age preTestScore postTestScore 0 Jason 42 4 25 1 Molly 52 24 94 2 Tina 36 31 57 3 Jake 24 2 62 4 Amy 73 3 70 5 rows × 4 columns</description>

</item>

<item>

<title>Dictionary Basics</title>

<guid>https://chrisalbon.com/python/basics/dictionary_basics/</guid>

<description>Basics Not sequences, but mappings. That is, stored by key, not relative position. Dictionaries are mutable. Build a dictionary via brackets unef_org = {&#39;name&#39; : &#39;UNEF&#39;, &#39;staff&#39; : 32, &#39;url&#39; : &#39;http://unef.org&#39;} View the variable unef_org {'name': 'UNEF', 'staff': 32, 'url': 'http://unef.org'} Build a dict via keys who_org = {} who_org[&#39;name&#39;] = &#39;WHO&#39; who_org[&#39;staff&#39;] = &#39;10&#39; who_org[&#39;url&#39;] = &#39;http://who.org&#39; View the variable who_org {'name': 'WHO', 'staff': '10', 'url': 'http://who.</description>

</item>

<item>

<title>Display Scientific Notation As Floats</title>

<guid>https://chrisalbon.com/python/basics/display_scientific_notation_as_floats/</guid>

<description>Create Values # Create a number in scientific notation value_scientific_notation = 6.32000000e-03 # Create a vector of numbers in scientific notation vector_scientific_notation = [6.32000000e-03, 1.80000000e+01, 2.31000000e+00, 0.00000000e+00, 5.38000000e-01, 6.57500000e+00, 6.52000000e+01, 4.09000000e+00, 1.00000000e+00, 2.96000000e+02, 1.53000000e+01, 3.96900000e+02, 4.98000000e+00] Display Values As Floats # Display value as a float &#39;{:f}&#39;.format(value_scientific_notation) '0.006320' # Display vector values as floats [&#39;{:f}&#39;.format(x) for x in vector_scientific_notation] ['0.006320', '18.000000', '2.310000', '0.000000', '0.538000', '6.575000', '65.200000', '4.090000', '1.000000', '296.</description>

</item>

<item>

<title>Drilling Down With Beautiful Soup</title>

<guid>https://chrisalbon.com/python/web_scraping/beautiful_soup_drill_down/</guid>

<description>Preliminaries # Import required modules import requests from bs4 import BeautifulSoup import pandas as pd Download the HTML and create a Beautiful Soup object # Create a variable with the URL to this tutorial url = &#39;http://en.wikipedia.org/wiki/List_of_A_Song_of_Ice_and_Fire_characters&#39; # Scrape the HTML at the url r = requests.get(url) # Turn the HTML into a Beautiful Soup object soup = BeautifulSoup(r.text, &#34;lxml&#34;) If we looked at the soup object, we&rsquo;d see that the names we want are in a hierarchical list.</description>

</item>

<item>

<title>Dropping Rows And Columns In pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_dropping_column_and_rows/</guid>

<description>Import modules import pandas as pd Create a dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;year&#39;: [2012, 2012, 2013, 2014, 2014], &#39;reports&#39;: [4, 24, 31, 2, 3]} df = pd.DataFrame(data, index = [&#39;Cochice&#39;, &#39;Pima&#39;, &#39;Santa Cruz&#39;, &#39;Maricopa&#39;, &#39;Yuma&#39;]) df name reports year Cochice Jason 4 2012 Pima Molly 24 2012 Santa Cruz Tina 31 2013 Maricopa Jake 2 2014 Yuma Amy 3 2014 Drop an observation (row) df.</description>

</item>

<item>

<title>Enumerate A List</title>

<guid>https://chrisalbon.com/python/data_wrangling/enumerate_a_list/</guid>

<description># Create a list of strings data = [&#39;One&#39;,&#39;Two&#39;,&#39;Three&#39;,&#39;Four&#39;,&#39;Five&#39;] # For each item in the enumerated variable, data for item in enumerate(data): # Print the whole enumerated element print(item) # Print only the value (not the index number) print(item[1]) (0, 'One') One (1, 'Two') Two (2, 'Three') Three (3, 'Four') Four (4, 'Five') Five </description>

</item>

<item>

<title>Exiting A Loop</title>

<guid>https://chrisalbon.com/python/basics/exiting_a_loop/</guid>

<description>Create A List # Create a list: armies = [&#39;Red Army&#39;, &#39;Blue Army&#39;, &#39;Green Army&#39;] Breaking Out Of A For Loop for army in armies: print(army) if army == &#39;Blue Army&#39;: print(&#39;Blue Army Found! Stopping.&#39;) break Red Army Blue Army Blue Army Found! Stopping. Notice that the loop stopped after the conditional if statement was satisfied.

Exiting If Loop Completed A loop will exit when completed, but using an else statement we can add an action at the conclusion of the loop if it hasn&rsquo;t been exited earlier.</description>

</item>

<item>

<title>Expand Cells Containing Lists Into Their Own Variables In Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_expand_cells_containing_lists/</guid>

<description># import pandas import pandas as pd # create a dataset raw_data = {&#39;score&#39;: [1,2,3], &#39;tags&#39;: [[&#39;apple&#39;,&#39;pear&#39;,&#39;guava&#39;],[&#39;truck&#39;,&#39;car&#39;,&#39;plane&#39;],[&#39;cat&#39;,&#39;dog&#39;,&#39;mouse&#39;]]} df = pd.DataFrame(raw_data, columns = [&#39;score&#39;, &#39;tags&#39;]) # view the dataset df score tags 0 1 [apple, pear, guava] 1 2 [truck, car, plane] 2 3 [cat, dog, mouse] # expand df.tags into its own dataframe tags = df[&#39;tags&#39;].apply(pd.Series) # rename each variable in tags tags = tags.</description>

</item>

<item>

<title>Filter pandas Dataframes</title>

<guid>https://chrisalbon.com/python/data_wrangling/filter_dataframes/</guid>

<description>Import modules import pandas as pd Create Dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;year&#39;: [2012, 2012, 2013, 2014, 2014], &#39;reports&#39;: [4, 24, 31, 2, 3], &#39;coverage&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(data, index = [&#39;Cochice&#39;, &#39;Pima&#39;, &#39;Santa Cruz&#39;, &#39;Maricopa&#39;, &#39;Yuma&#39;]) df coverage name reports year Cochice 25 Jason 4 2012 Pima 94 Molly 24 2012 Santa Cruz 57 Tina 31 2013 Maricopa 62 Jake 2 2014 Yuma 70 Amy 3 2014 View Column df[&#39;name&#39;] Cochice Jason Pima Molly Santa Cruz Tina Maricopa Jake Yuma Amy Name: name, dtype: object View Two Columns df[[&#39;name&#39;, &#39;reports&#39;]] .</description>

</item>

<item>

<title>Find Largest Value In A Dataframe Column</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_find_largest_value_in_column/</guid>

<description># import modules %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np # Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;preTestScore&#39;: [4, 24, 31, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df first_name last_name age preTestScore postTestScore 0 Jason Miller 42 4 25 1 Molly Jacobson 52 24 94 2 Tina Ali 36 31 57 3 Jake Milner 24 2 62 4 Amy Cooze 73 3 70 # Index of the row with the highest value in the preTestScore column df[&#39;preTestScore&#39;].</description>

</item>

<item>

<title>Find The Max Value In A Dictionary</title>

<guid>https://chrisalbon.com/python/basics/find_the_max_value_in_a_dictionary/</guid>

<description> Create A Dictionary ages = {&#39;John&#39;: 21, &#39;Mike&#39;: 52, &#39;Sarah&#39;: 12, &#39;Bob&#39;: 43 } Find The Maximum Value Of The Values max(zip(ages.values(), ages.keys())) (52, 'Mike') </description>

</item>

<item>

<title>Find Unique Values In Pandas Dataframes</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_find_unique_values/</guid>

<description>import pandas as pd import numpy as np raw_data = {&#39;regiment&#39;: [&#39;51st&#39;, &#39;29th&#39;, &#39;2nd&#39;, &#39;19th&#39;, &#39;12th&#39;, &#39;101st&#39;, &#39;90th&#39;, &#39;30th&#39;, &#39;193th&#39;, &#39;1st&#39;, &#39;94th&#39;, &#39;91th&#39;], &#39;trucks&#39;: [&#39;MAZ-7310&#39;, np.nan, &#39;MAZ-7310&#39;, &#39;MAZ-7310&#39;, &#39;Tatra 810&#39;, &#39;Tatra 810&#39;, &#39;Tatra 810&#39;, &#39;Tatra 810&#39;, &#39;ZIS-150&#39;, &#39;Tatra 810&#39;, &#39;ZIS-150&#39;, &#39;ZIS-150&#39;], &#39;tanks&#39;: [&#39;Merkava Mark 4&#39;, &#39;Merkava Mark 4&#39;, &#39;Merkava Mark 4&#39;, &#39;Leopard 2A6M&#39;, &#39;Leopard 2A6M&#39;, &#39;Leopard 2A6M&#39;, &#39;Arjun MBT&#39;, &#39;Leopard 2A6M&#39;, &#39;Arjun MBT&#39;, &#39;Arjun MBT&#39;, &#39;Arjun MBT&#39;, &#39;Arjun MBT&#39;], &#39;aircraft&#39;: [&#39;none&#39;, &#39;none&#39;, &#39;none&#39;, &#39;Harbin Z-9&#39;, &#39;Harbin Z-9&#39;, &#39;none&#39;, &#39;Harbin Z-9&#39;, &#39;SH-60B Seahawk&#39;, &#39;SH-60B Seahawk&#39;, &#39;SH-60B Seahawk&#39;, &#39;SH-60B Seahawk&#39;, &#39;SH-60B Seahawk&#39;]} df = pd.</description>

</item>

<item>

<title>Flatten Lists Of Lists</title>

<guid>https://chrisalbon.com/python/basics/flatten_list_of_lists/</guid>

<description># Create a list containing three lists of names list_of_lists = [[&#39;Amy&#39;,&#39;Betty&#39;,&#39;Cathryn&#39;,&#39;Dana&#39;], [&#39;Elizabeth&#39;,&#39;Fay&#39;,&#39;Gora&#39;], [&#39;Heidi&#39;,&#39;Jane&#39;,&#39;Kayley&#39;]] # For each element in list_of_lists, take each element in the list flattened_list = [i for row in list_of_lists for i in row] # View the flattened list flattened_list ['Amy', 'Betty', 'Cathryn', 'Dana', 'Elizabeth', 'Fay', 'Gora', 'Heidi', 'Jane', 'Kayley'] </description>

</item>

<item>

<guid>https://chrisalbon.com/python/basics/for_loops/</guid>

<description># One at a time, assign each value of the sequence to i and, for i in [432, 342, 928, 920]: # multiply i by 10 and store the product in a new variable, x, x = i * 10 # print the value of x print(x) # after the entire sequence processes, else: # print this print(&#39;All done!&#39;) 4320 3420 9280 9200 All done! </description>

</item>

<item>

<title>Formatting Numbers</title>

<guid>https://chrisalbon.com/python/basics/formatting_numbers/</guid>

<description> Create A Long Number annual_revenue = 9282904.9282872782 Format Number # Format rounded to two decimal places format(annual_revenue, &#39;0.2f&#39;) '9282904.93' # Format with commas and rounded to one decimal place format(annual_revenue, &#39;0,.1f&#39;) '9,282,904.9' # Format as scientific notation format(annual_revenue, &#39;e&#39;) '9.282905e+06' # Format as scientific notation rounded to two decimals format(annual_revenue, &#39;0.2E&#39;) '9.28E+06' </description>

</item>

<item>

<title>Function Annotation Examples</title>

<guid>https://chrisalbon.com/python/basics/function_annotation_examples/</guid>

<description> Create A Function With Annotations &#39;&#39;&#39; Create a function. The argument &#39;text&#39; is the string to print with the default value &#39;default string&#39; and the argument &#39;n&#39; is an integer number of times to print with the default value of 1. The function should return a string. &#39;&#39;&#39; def print_text(text:&#39;string to print&#39;=&#39;default string&#39;, n:&#39;integer, times to print&#39;=1) -&gt; str: return text * n Run The Function # Run the function with arguments print_text(&#39;string&#39;, 4) 'stringstringstringstring' # Run the function with default arguments print_text() 'default string' </description>

</item>

<item>

<title>Function Basics</title>

<guid>https://chrisalbon.com/python/basics/function_basics/</guid>

<description>Create Function Called print_max def print_max(x, y): # if x is larger than y if x &gt; y: # then print this print(x, &#39;is maximum&#39;) # if x is equal to y elif x == y: # print this print(x, &#39;is equal to&#39;, y) # otherwise else: # print this print(y, &#39;is maximum&#39;) Run Function With Two Arguments print_max(3,4) 4 is maximum Note: By default, variables created within functions are local to the function.</description>

</item>

<item>

<title>Functions Vs. Generators</title>

<guid>https://chrisalbon.com/python/basics/functions_vs_generators/</guid>

<description>Create A Function # Create a function that def function(names): # For each name in a list of names for name in names: # Returns the name return name # Create a variable of that function students = function([&#39;Abe&#39;, &#39;Bob&#39;, &#39;Christina&#39;, &#39;Derek&#39;, &#39;Eleanor&#39;]) # Run the function students 'Abe' Now we have a problem: we were only returned the name of the first student. Why? Because the function only ran the for name in names iteration once!</description>

</item>

<item>

<title>Generate Tweets Using Markov Chains</title>

<guid>https://chrisalbon.com/python/other/generate_tweets_using_markov_chain/</guid>

<description>Preliminaries import markovify Load Corpus The corpus I am using is just one I found online. The corpus you choose is central to generating realistic text.

# Get raw text as string with open(&#34;brown.txt&#34;) as f: text = f.read() Build Markov Chain # Build the model. text_model = markovify.Text(text) Generate One Tweet # Print three randomly-generated sentences of no more than 140 characters for i in range(3): print(text_model.make_short_sentence(140)) Within a month, calls were still productive and most devotees of baseball attended the dozens of them.</description>

</item>

<item>

<title>Generating Random Numbers With NumPy</title>

<guid>https://chrisalbon.com/python/basics/generating_random_numbers_with_numpy/</guid>

<description> Import Numpy import numpy as np Generate A Random Number From The Normal Distribution np.random.normal() 0.5661104974399703 Generate Four Random Numbers From The Normal Distribution np.random.normal(size=4) array([-1.03175853, 1.2867365 , -0.23560103, -1.05225393]) Generate Four Random Numbers From The Uniform Distribution np.random.uniform(size=4) array([ 0.00193123, 0.51932356, 0.87656884, 0.33684494]) Generate Four Random Integers Between 1 and 100 np.random.randint(low=1, high=100, size=4) array([96, 25, 94, 77]) </description>

</item>

<item>

<title>Generator Expressions</title>

<guid>https://chrisalbon.com/python/basics/generator_expressions/</guid>

<description># Create a list of students students = [&#39;Abe&#39;, &#39;Bob&#39;, &#39;Christina&#39;, &#39;Derek&#39;, &#39;Eleanor&#39;] # Create a generator expression that yields lower-case versions of the student&#39;s names lowercase_names = (student.lower() for student in students) # View the generator object lowercase_names &lt;generator object &lt;genexpr&gt; at 0x104837518&gt; # Get the next name lower-cased next(lowercase_names) 'abe' # Get the next name lower-cased next(lowercase_names) 'bob' # Get the remaining names lower-cased list(lowercase_names) ['christina', 'derek', 'eleanor'] </description>

</item>

<item>

<title>Geocoding And Reverse Geocoding</title>

<guid>https://chrisalbon.com/python/data_wrangling/geocoding_and_reverse_geocoding/</guid>

<description>Geocoding (converting a physical address or location into latitude/longitude) and reverse geocoding (converting a lat/long to a physical address or location) are common tasks when working with geo-data.

Python offers a number of packages to make the task incredibly easy. In the tutorial below, I use pygeocoder, a wrapper for Google&rsquo;s geo-API, to both geocode and reverse geocode.

Preliminaries First we want to load the packages we will want to use in the script.</description>

</item>

<item>

<title>Geolocate A City And Country</title>

<guid>https://chrisalbon.com/python/data_wrangling/geolocate_a_city_and_country/</guid>

<description>This tutorial creates a function that attempts to take a city and country and return its latitude and longitude. But when the city is unavailable (which is often the case), the function returns the latitude and longitude of the center of the country.

Preliminaries from geopy.geocoders import Nominatim geolocator = Nominatim() import numpy as np Create Geolocation Function def geolocate(city=None, country=None): &#39;&#39;&#39; Inputs city and country, or just country. Returns the lat/long coordinates of either the city if possible, if not, then returns lat/long of the center of the country.</description>

</item>

<item>

<title>Geolocate A City Or Country</title>

<guid>https://chrisalbon.com/python/data_wrangling/geolocate_a_city_or_country/</guid>

</item>

<item>

<title>Group A Time Series With pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_group_by_time/</guid>

<description>Import required modules import pandas as pd import numpy as np Create a dataframe df = pd.DataFrame() df[&#39;german_army&#39;] = np.random.randint(low=20000, high=30000, size=100) df[&#39;allied_army&#39;] = np.random.randint(low=20000, high=40000, size=100) df.index = pd.date_range(&#39;1/1/2014&#39;, periods=100, freq=&#39;H&#39;) df.head() german_army allied_army 2014-01-01 00:00:00 28755 33938 2014-01-01 01:00:00 25176 28631 2014-01-01 02:00:00 23261 39685 2014-01-01 03:00:00 28686 27756 2014-01-01 04:00:00 24588 25681 Truncate the dataframe df.</description>

</item>

<item>

<title>Group Bar Plot In MatPlotLib</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_grouped_bar_plot/</guid>

<description>Preliminaries %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;pre_score&#39;: [4, 24, 31, 2, 3], &#39;mid_score&#39;: [25, 94, 57, 62, 70], &#39;post_score&#39;: [5, 43, 23, 23, 51]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;pre_score&#39;, &#39;mid_score&#39;, &#39;post_score&#39;]) df first_name pre_score mid_score post_score 0 Jason 4 25 5 1 Molly 24 94 43 2 Tina 31 57 23 3 Jake 2 62 23 4 Amy 3 70 51 Make plot # Setting the positions and width for the bars pos = list(range(len(df[&#39;pre_score&#39;]))) width = 0.</description>

</item>

<item>

<title>Group Data By Time</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_group_data_by_time/</guid>

<description>On March 13, 2016, version 0.18.0 of Pandas was released, with significant changes in how the resampling function operates. This tutorial follows v0.18.0 and will not work for previous versions of pandas.
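
As a hedged illustration of that change: from v0.18.0 on, resample returns a deferred object and the aggregation is called on it as a method, much like groupby. The sample series below is made up for this sketch and is not the tutorial's data.

```python
import pandas as pd

# 96 half-hourly observations covering exactly two days (made-up data)
index = pd.date_range("2016-03-13", periods=96, freq="30min")
series = pd.Series(range(96), index=index)

# resample() groups the observations by day; mean() then aggregates each group
daily_mean = series.resample("D").mean()
print(daily_mean)
```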

First let&rsquo;s load the modules we care about

Preliminaries # Import required packages import pandas as pd import datetime import numpy as np Next, let&rsquo;s create some sample data that we can group by time as an example.</description>

</item>

<item>

<title>Group Pandas Data By Hour Of The Day</title>

<guid>https://chrisalbon.com/python/data_wrangling/group_pandas_data_by_hour_of_the_day/</guid>

<description>Preliminaries # Import libraries import pandas as pd import numpy as np Create Data # Create a time series of 2000 elements, one every five minutes starting on 1/1/2000 time = pd.date_range(&#39;1/1/2000&#39;, periods=2000, freq=&#39;5min&#39;) # Create a pandas series with random values between 0 and 100, using &#39;time&#39; as the index series = pd.Series(np.random.randint(100, size=2000), index=time) View Data # View the first few rows of the data series[0:10] 2000-01-01 00:00:00 40 2000-01-01 00:05:00 13 2000-01-01 00:10:00 99 2000-01-01 00:15:00 72 2000-01-01 00:20:00 4 2000-01-01 00:25:00 36 2000-01-01 00:30:00 24 2000-01-01 00:35:00 20 2000-01-01 00:40:00 83 2000-01-01 00:45:00 44 Freq: 5T, dtype: int64 Group Data By Time Of The Day # Group the data by the index&#39;s hour value, then aggregate by the average series.</description>

</item>

<item>

<title>Grouping Rows In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_group_rows_by/</guid>

<description># Import modules import pandas as pd # Example dataframe raw_data = {&#39;regiment&#39;: [&#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;], &#39;company&#39;: [&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;, &#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;,&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;], &#39;name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;, &#39;Jacon&#39;, &#39;Ryaner&#39;, &#39;Sone&#39;, &#39;Sloan&#39;, &#39;Piger&#39;, &#39;Riani&#39;, &#39;Ali&#39;], &#39;preTestScore&#39;: [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]} df = pd.</description>

</item>

<item>

<title>Hard Wrapping Text</title>

<guid>https://chrisalbon.com/python/basics/hard_wrapping_text/</guid>

<description>Preliminaries import textwrap Create Text # Create some text excerpt = &#39;Then there was the bad weather. It would come in one day when the fall was over. We would have to shut the windows in the night against the rain and the cold wind would strip the leaves from the trees in the Place Contrescarpe. The leaves lay sodden in the rain and the wind drove the rain against the big green autobus at the terminal and the Café des Amateurs was crowded and the windows misted over from the heat and the smoke inside.</description>

</item>

<item>

<title>Hierarchical Data In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_hierarchical_data/</guid>

<description># import modules import pandas as pd # Create dataframe raw_data = {&#39;regiment&#39;: [&#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Nighthawks&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Dragoons&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;, &#39;Scouts&#39;], &#39;company&#39;: [&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;, &#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;,&#39;1st&#39;, &#39;1st&#39;, &#39;2nd&#39;, &#39;2nd&#39;], &#39;name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;, &#39;Jacon&#39;, &#39;Ryaner&#39;, &#39;Sone&#39;, &#39;Sloan&#39;, &#39;Piger&#39;, &#39;Riani&#39;, &#39;Ali&#39;], &#39;preTestScore&#39;: [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]} df = pd.</description>

</item>

<item>

<title>Histograms In MatPlotLib</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_histogram/</guid>

<description>Preliminaries %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np import math # Set IPython&#39;s max row display to 1000 pd.set_option(&#39;display.max_rows&#39;, 1000) # Set IPython&#39;s max column display to 50 pd.set_option(&#39;display.max_columns&#39;, 50) Create dataframe df = pd.read_csv(&#39;https://www.dropbox.com/s/52cb7kcflr8qm2u/5kings_battles_v1.csv?dl=1&#39;) df.head() name year battle_number attacker_king defender_king attacker_1 attacker_2 attacker_3 attacker_4 defender_1 defender_2 defender_3 defender_4 attacker_outcome battle_type major_death major_capture attacker_size defender_size attacker_commander defender_commander summer location region note 0 Battle of the Golden Tooth 298 1 Joffrey/Tommen Baratheon Robb Stark Lannister NaN NaN NaN Tully NaN NaN NaN win pitched battle 1 0 15000 4000 Jaime Lannister Clement Piper, Vance 1 Golden Tooth The Westerlands NaN 1 Battle at the Mummer's Ford 298 2 Joffrey/Tommen Baratheon Robb Stark Lannister NaN NaN NaN Baratheon NaN NaN NaN win ambush 1 0 NaN 120 Gregor Clegane Beric Dondarrion 1 Mummer's Ford The Riverlands NaN 2 Battle of Riverrun 298 3 Joffrey/Tommen Baratheon Robb Stark Lannister NaN NaN NaN Tully NaN NaN NaN win pitched battle 0 1 15000 10000 Jaime Lannister, Andros Brax Edmure Tully, Tytos Blackwood 1 Riverrun The Riverlands NaN 3 Battle of the Green Fork 298 4 Robb Stark Joffrey/Tommen Baratheon Stark NaN NaN NaN Lannister NaN NaN NaN loss pitched battle 1 1 18000 20000 Roose Bolton, Wylis Manderly, Medger Cerwyn, H.</description>

</item>

<item>

<title>How To Use Default Dicts</title>

<guid>https://chrisalbon.com/python/basics/how_to_use_default_dicts/</guid>

<description> Preliminaries import collections Create A DefaultDict Default dicts work just like regular dictionaries, except that when a key is requested that doesn&rsquo;t have a value, a default value (note: value, not key) is supplied.
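
The fallback behavior described above can be sketched as follows. This is a minimal illustration rather than part of the original tutorial; the name visit_counts and its keys are hypothetical.

```python
import collections

# int() returns 0, so any missing key starts at 0 (hypothetical example data)
visit_counts = collections.defaultdict(int)

# Incrementing a key that does not exist yet works because the default is supplied first
visit_counts['home'] += 1

# Merely reading a missing key also creates it with the default value
about_visits = visit_counts['about']
```

Note that looking up a missing key inserts it into the dictionary, which is why a plain read is enough to add an entry.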

# Create a defaultdict with the default value of 0 (int&#39;s default value is 0) arrests = collections.defaultdict(int) Add A New Key With A Value # Add an entry of a person with 10 arrests arrests[&#39;Sarah Miller&#39;] = 10 # View dictionary arrests defaultdict(int, {'Sarah Miller': 10}) Add A New Key Without A Value # Add an entry of a person with no value for arrests, # thus the default value is used arrests[&#39;Bill James&#39;] 0 # View dictionary arrests defaultdict(int, {'Bill James': 0, 'Sarah Miller': 10}) </description>

</item>

<item>

<title>If Else On Any Or All Elements</title>

<guid>https://chrisalbon.com/python/basics/ifelse_on_any_or_all_elements/</guid>

<description>Preliminaries # import pandas as pd import pandas as pd Create a simulated dataset # Create an example dataframe data = {&#39;score&#39;: [1,2,3,4,5]} df = pd.DataFrame(data) df score 0 1 1 2 2 3 3 4 4 5 Does any cell equal 3? # If any element in df.score equals three, if (df.</description>

</item>

<item>

<title>Indexing And Slicing NumPy Arrays</title>

<guid>https://chrisalbon.com/python/basics/indexing_and_slicing_numpy_arrays/</guid>

<description>Slicing Arrays Explanation Of Broadcasting Unlike many other data types, slicing an array into a new variable means that any changes to that new variable are broadcast to the original variable. Put another way, a slice is a hotlink to the original array variable, not a separate and independent copy of it.
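
A minimal sketch of the slice-as-hotlink behavior described above, assuming NumPy is installed. The array values and variable names are illustrative and not taken from the original tutorial. It also shows .copy(), the usual way to get an independent array.

```python
import numpy as np

# Illustrative data, not from the original tutorial
original = np.array([10, 20, 30, 40])

# A basic slice is a view onto the same memory as the original array
first_two = original[:2]
first_two[:] = 0  # this also overwrites original[0] and original[1]

# .copy() produces an independent array, so this change stays local
independent = original.copy()
independent[3] = 99
```

After this runs, original holds 0, 0, 30, 40: the write through the slice reached it, while the write to the copy did not.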

# Import Modules import numpy as np # Create an array of battle casualties from the first to the last battle battleDeaths = np.</description>

</item>

<item>

<title>Indexing And Slicing NumPy Arrays</title>

<guid>https://chrisalbon.com/python/basics/numpy_indexing_and_slicing/</guid>

<description># Import modules import numpy as np # Create a 2x2 array battle_deaths = [[344, 2345], [253, 4345]] deaths = np.array(battle_deaths) deaths array([[ 344, 2345], [ 253, 4345]]) # Select the top row, second item deaths[0, 1] 2345 # Select the second column deaths[:, 1] array([2345, 4345]) # Select the second row deaths[1, :] array([ 253, 4345]) # Create an array of civilian deaths civilian_deaths = np.array([4352, 233, 3245, 256, 2394]) civilian_deaths array([4352, 233, 3245, 256, 2394]) # Find the index of battles with fewer than 500 deaths few_civ_deaths = np.</description>

</item>

<item>

<title>Iterate An Ifelse Over A List</title>

<guid>https://chrisalbon.com/python/basics/iterate_ifelse_over_list/</guid>

<description> Create some data word_list = [&#39;Egypt&#39;, &#39;Watching&#39;, &#39;Eleanor&#39;] vowels = [&#39;A&#39;, &#39;E&#39;, &#39;I&#39;, &#39;O&#39;, &#39;U&#39;] Create a for loop # for each item in the word_list, for word in word_list: # if any word starts with e, where e is vowels, if any([word.startswith(e) for e in vowels]): # then print is valid, print(&#39;Is valid&#39;) # if not, else: # print invalid print(&#39;Invalid&#39;) Is valid Invalid Is valid </description>

</item>

<item>

<title>Iterate Over Multiple Lists Simultaneously</title>

<guid>https://chrisalbon.com/python/basics/iterate_over_multiple_lists_simultaneously/</guid>

<description> Create Two Lists names = [&#39;James&#39;, &#39;Bob&#39;, &#39;Sarah&#39;, &#39;Marco&#39;, &#39;Nancy&#39;, &#39;Sally&#39;] ages = [42, 13, 14, 25, 63, 23] Iterate Over Both Lists At Once for name, age in zip(names, ages): print(name, age) James 42 Bob 13 Sarah 14 Marco 25 Nancy 63 Sally 23 </description>

</item>

<item>

<title>Iterating Over Dictionary Keys</title>

<guid>https://chrisalbon.com/python/basics/iterating_over_dictionary_keys/</guid>

<description>Create A Dictionary Officers = {&#39;Michael Mulligan&#39;: &#39;Red Army&#39;, &#39;Steven Johnson&#39;: &#39;Blue Army&#39;, &#39;Jessica Billars&#39;: &#39;Green Army&#39;, &#39;Sodoni Dogla&#39;: &#39;Purple Army&#39;, &#39;Chris Jefferson&#39;: &#39;Orange Army&#39;} Officers {'Chris Jefferson': 'Orange Army', 'Jessica Billars': 'Green Army', 'Michael Mulligan': 'Red Army', 'Sodoni Dogla': 'Purple Army', 'Steven Johnson': 'Blue Army'} Use Dictionary Comprehension # Display all dictionary entries where the key doesn&#39;t start with &#39;Chris&#39; {keys : Officers[keys] for keys in Officers if not keys.</description>

</item>

<item>

<title>Join And Merge Pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_join_merge_dataframe/</guid>

<description>import modules import pandas as pd from IPython.display import display from IPython.display import Image Create a dataframe raw_data = { &#39;subject_id&#39;: [&#39;1&#39;, &#39;2&#39;, &#39;3&#39;, &#39;4&#39;, &#39;5&#39;], &#39;first_name&#39;: [&#39;Alex&#39;, &#39;Amy&#39;, &#39;Allen&#39;, &#39;Alice&#39;, &#39;Ayoung&#39;], &#39;last_name&#39;: [&#39;Anderson&#39;, &#39;Ackerman&#39;, &#39;Ali&#39;, &#39;Aoni&#39;, &#39;Atiches&#39;]} df_a = pd.DataFrame(raw_data, columns = [&#39;subject_id&#39;, &#39;first_name&#39;, &#39;last_name&#39;]) df_a subject_id first_name last_name 0 1 Alex Anderson 1 2 Amy Ackerman 2 3 Allen Ali 3 4 Alice Aoni 4 5 Ayoung Atiches Create a second dataframe raw_data = { &#39;subject_id&#39;: [&#39;4&#39;, &#39;5&#39;, &#39;6&#39;, &#39;7&#39;, &#39;8&#39;], &#39;first_name&#39;: [&#39;Billy&#39;, &#39;Brian&#39;, &#39;Bran&#39;, &#39;Bryce&#39;, &#39;Betty&#39;], &#39;last_name&#39;: [&#39;Bonder&#39;, &#39;Black&#39;, &#39;Balwner&#39;, &#39;Brice&#39;, &#39;Btisan&#39;]} df_b = pd.</description>

</item>

<item>

<title>Lambda Functions</title>

<guid>https://chrisalbon.com/python/basics/lambda_functions/</guid>

<description> In Python, it is possible to string lambda functions together.

Create a list, called pipeline, that contains three mini functions pipeline = [lambda x: x **2 - 1 + 5, lambda x: x **20 - 2 + 3, lambda x: x **200 - 1 + 4] For each item in pipeline, run the lambda function with x = 3 for f in pipeline: print(f(3)) 13 3486784402 265613988875874769338781322035779626829233452653394495974574961739092490901302182994384699044004 </description>

</item>

<item>

<title>List Unique Values In A pandas Column</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_list_unique_values_in_column/</guid>

<description>Special thanks to Bob Haffner for pointing out a better way of doing it.

Preliminaries # Import modules import pandas as pd # Set IPython&#39;s max row display to 1000 pd.set_option(&#39;display.max_rows&#39;, 1000) # Set IPython&#39;s max column display to 50 pd.set_option(&#39;display.max_columns&#39;, 50) Create an example dataframe # Create an example dataframe data = {&#39;name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;year&#39;: [2012, 2012, 2013, 2014, 2014], &#39;reports&#39;: [4, 24, 31, 2, 3]} df = pd.</description>

</item>

<item>

<title>Load A JSON File Into Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/load_json_file_into_pandas/</guid>

<description> Preliminaries # Load library import pandas as pd Load JSON File # Create URL to JSON file (alternatively this can be a filepath) url = &#39;https://raw.githubusercontent.com/chrisalbon/simulated_datasets/master/data.json&#39; # Load the JSON file into a data frame df = pd.read_json(url, orient=&#39;columns&#39;) # View the first ten rows df.head(10) category datetime integer 0 0 2015-01-01 00:00:00 5 1 0 2015-01-01 00:00:01 5 10 0 2015-01-01 00:00:10 5 11 0 2015-01-01 00:00:11 5 12 0 2015-01-01 00:00:12 8 13 0 2015-01-01 00:00:13 9 14 0 2015-01-01 00:00:14 8 15 0 2015-01-01 00:00:15 8 16 0 2015-01-01 00:00:16 2 17 0 2015-01-01 00:00:17 1 </description>

</item>

<item>

<title>Load An Excel File Into Pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/load_excel_file_into_pandas/</guid>

<description> Preliminaries # Load library import pandas as pd Load Excel File # Create URL to Excel file (alternatively this can be a filepath) url = &#39;https://raw.githubusercontent.com/chrisalbon/simulated_datasets/master/data.xlsx&#39; # Load the first sheet of the Excel file into a data frame df = pd.read_excel(url, sheet_name=0, header=1) # View the first ten rows df.head(10) 5 2015-01-01 00:00:00 0 0 5 2015-01-01 00:00:01 0 1 9 2015-01-01 00:00:02 0 2 6 2015-01-01 00:00:03 0 3 6 2015-01-01 00:00:04 0 4 9 2015-01-01 00:00:05 0 5 7 2015-01-01 00:00:06 0 6 1 2015-01-01 00:00:07 0 7 6 2015-01-01 00:00:08 0 8 9 2015-01-01 00:00:09 0 9 5 2015-01-01 00:00:10 0 </description>

</item>

<item>

<title>Load Excel Spreadsheet As pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_dataframe_load_xls/</guid>

<description># import modules import pandas as pd # Import the excel file and call it xls_file xls_file = pd.ExcelFile(&#39;../data/example.xls&#39;) xls_file &lt;pandas.io.excel.ExcelFile at 0x111912be0&gt; # View the excel file&#39;s sheet names xls_file.sheet_names ['Sheet1'] # Load the xls file&#39;s Sheet1 as a dataframe df = xls_file.parse(&#39;Sheet1&#39;) df year deaths_attacker deaths_defender soldiers_attacker soldiers_defender wounded_attacker wounded_defender 0 1945 425 423 2532 37235 41 14 1 1956 242 264 6346 2523 214 1424 2 1964 323 1231 3341 2133 131 131 3 1969 223 23 6732 1245 12 12 4 1971 783 23 12563 2671 123 34 5 1981 436 42 2356 7832 124 124 6 1982 324 124 253 2622 264 1124 7 1992 3321 631 5277 3331 311 1431 8 1999 262 232 2732 2522 132 122 9 2004 843 213 6278 26773 623 2563 </description>

</item>

<item>

<title>Loading A CSV Into pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_dataframe_importing_csv/</guid>

<description>import modules import pandas as pd import numpy as np Create dataframe (that we will be importing) raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#34;.&#34;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;preTestScore&#39;: [4, 24, 31, &#34;.&#34;, &#34;.&#34;], &#39;postTestScore&#39;: [&#34;25,000&#34;, &#34;94,000&#34;, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df</description>

</item>

<item>

<title>Logical Operations</title>

<guid>https://chrisalbon.com/python/basics/logical_operations/</guid>

<description>Create some simulated variables x = 6 y = 9 z = 12 x or y x or y 6 x and y x and y 9 not x not x False x is equal to y x == y False x is not equal to y x != y True One is less than two 1 &lt; 2 True Two is less than or equal to four 2 &lt;= 4 True Three is equal to five 3 == 5 False Three is not equal to four 3 !</description>

</item>

<item>

<title>Long To Wide Format</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_long_to_wide/</guid>

<description>import modules import pandas as pd Create &ldquo;long&rdquo; dataframe raw_data = {&#39;patient&#39;: [1, 1, 1, 2, 2], &#39;obs&#39;: [1, 2, 3, 1, 2], &#39;treatment&#39;: [0, 1, 0, 1, 0], &#39;score&#39;: [6252, 24243, 2345, 2342, 23525]} df = pd.DataFrame(raw_data, columns = [&#39;patient&#39;, &#39;obs&#39;, &#39;treatment&#39;, &#39;score&#39;]) df patient obs treatment score 0 1 1 0 6252 1 1 2 1 24243 2 1 3 0 2345 3 2 1 1 2342 4 2 2 0 23525 Make a &ldquo;wide&rdquo; dataframe Now we will create a &ldquo;wide&rdquo; dataframe with the rows by patient number, the columns being by observation number, and the cell values being the score values.</description>

</item>

<item>

<title>Looping Over Two Lists</title>

<guid>https://chrisalbon.com/python/basics/looping_over_two_lists/</guid>

<description># Create a list of length 3: armies = [&#39;Red Army&#39;, &#39;Blue Army&#39;, &#39;Green Army&#39;] # Create a list of length 4: units = [&#39;Red Infantry&#39;, &#39;Blue Armor&#39;, &#39;Green Artillery&#39;, &#39;Orange Aircraft&#39;] # For each element in the first list, for army, unit in zip(armies, units): # Display the corresponding index element of the second list: print(army, &#39;has the following options:&#39;, unit) Red Army has the following options: Red Infantry Blue Army has the following options: Blue Armor Green Army has the following options: Green Artillery Notice that the fourth item of the second list, orange aircraft, did not display.</description>

</item>

<item>

<title>Lower Case Column Names In Pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_lowercase_column_names/</guid>

<description>Preliminaries # Import modules import pandas as pd # Set IPython&#39;s max row display to 1000 pd.set_option(&#39;display.max_rows&#39;, 1000) # Set IPython&#39;s max column display to 50 pd.set_option(&#39;display.max_columns&#39;, 50) Create an example dataframe # Create an example dataframe data = {&#39;NAME&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;YEAR&#39;: [2012, 2012, 2013, 2014, 2014], &#39;REPORTS&#39;: [4, 24, 31, 2, 3]} df = pd.DataFrame(data, index = [&#39;Cochice&#39;, &#39;Pima&#39;, &#39;Santa Cruz&#39;, &#39;Maricopa&#39;, &#39;Yuma&#39;]) df NAME REPORTS YEAR Cochice Jason 4 2012 Pima Molly 24 2012 Santa Cruz Tina 31 2013 Maricopa Jake 2 2014 Yuma Amy 3 2014 Lowercase column names # Map the lowering function to all column names df.</description>

</item>

<item>

<title>Make New Columns Using Functions</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_make_new_columns_using_functions/</guid>

</item>

<item>

<title>Making A Matplotlib Scatterplot From A Pandas Dataframe</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_scatterplot_from_pandas/</guid>

<description>import modules %matplotlib inline import pandas as pd import matplotlib.pyplot as plt import numpy as np Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;female&#39;: [0, 1, 1, 0, 1], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;preTestScore&#39;: [4, 24, 31, 2, 3], &#39;postTestScore&#39;: [25, 94, 57, 62, 70]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;female&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df .</description>

</item>

<item>

<title>Map External Values To Dataframe Values in pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_map_values_to_values/</guid>

<description>import modules import pandas as pd Create dataframe raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, &#39;Molly&#39;, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, &#39;Jacobson&#39;, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;age&#39;: [42, 52, 36, 24, 73], &#39;city&#39;: [&#39;San Francisco&#39;, &#39;Baltimore&#39;, &#39;Miami&#39;, &#39;Douglas&#39;, &#39;Boston&#39;]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;city&#39;]) df first_name last_name age city 0 Jason Miller 42 San Francisco 1 Molly Jacobson 52 Baltimore 2 Tina Ali 36 Miami 3 Jake Milner 24 Douglas 4 Amy Cooze 73 Boston Create a dictionary of values city_to_state = { &#39;San Francisco&#39; : &#39;California&#39;, &#39;Baltimore&#39; : &#39;Maryland&#39;, &#39;Miami&#39; : &#39;Florida&#39;, &#39;Douglas&#39; : &#39;Arizona&#39;, &#39;Boston&#39; : &#39;Massachusetts&#39;} Map the values of the city_to_state dictionary to the values in the city variable df[&#39;state&#39;] = df[&#39;city&#39;].</description>

</item>

<item>

<title>Mathematical Operations</title>

<guid>https://chrisalbon.com/python/basics/math_operations/</guid>

<description>Import the math module import math Display the value of pi. math.pi 3.141592653589793 Display the value of e. math.e 2.718281828459045 Sine, cosine, and tangent math.sin(2 * math.pi / 180) 0.03489949670250097 Exponent 2 ** 4, pow(2, 4) (16, 16) Absolute value abs(-20) 20 Summation sum((1, 2, 3, 4)) 10 Minimum min(3, 9, 10, 12) 3 Maximum max(3, 5, 10, 15) 15 Floor math.</description>

</item>

<item>

<title>Matplotlib, A Simple Example</title>

<guid>https://chrisalbon.com/python/data_visualization/matplotlib_simple_example/</guid>

<description> Tell Jupyter to load matplotlib and display all visuals created inline (that is, on this page) %matplotlib inline Import matplotlib&rsquo;s pyplot module import matplotlib.pyplot as pyplot Create a simple plot pyplot.plot([1.6, 2.7]) [&lt;matplotlib.lines.Line2D at 0x10c4e7978&gt;] </description>

</item>

<item>

<title>Mine Twitter's Stream For Hashtags Or Words</title>

<guid>https://chrisalbon.com/python/other/mine_a_twitter_hashtags_and_words/</guid>

<description>This is a script which monitors Twitter for tweets containing certain hashtags, words, or phrases. When one of those appears, it saves that tweet and the user&rsquo;s information to a CSV file. A similar version of this script is available on GitHub here. The main difference between the code presented here and the repo is that here I have added extensive comments in the code explaining what is happening. Also, the code below runs as a Jupyter notebook.</description>

</item>

<item>

<title>Missing Data In pandas Dataframes</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_missing_data/</guid>

<description>import modules import pandas as pd import numpy as np Create dataframe with missing values raw_data = {&#39;first_name&#39;: [&#39;Jason&#39;, np.nan, &#39;Tina&#39;, &#39;Jake&#39;, &#39;Amy&#39;], &#39;last_name&#39;: [&#39;Miller&#39;, np.nan, &#39;Ali&#39;, &#39;Milner&#39;, &#39;Cooze&#39;], &#39;age&#39;: [42, np.nan, 36, 24, 73], &#39;sex&#39;: [&#39;m&#39;, np.nan, &#39;f&#39;, &#39;m&#39;, &#39;f&#39;], &#39;preTestScore&#39;: [4, np.nan, np.nan, 2, 3], &#39;postTestScore&#39;: [25, np.nan, np.nan, 62, 70]} df = pd.DataFrame(raw_data, columns = [&#39;first_name&#39;, &#39;last_name&#39;, &#39;age&#39;, &#39;sex&#39;, &#39;preTestScore&#39;, &#39;postTestScore&#39;]) df first_name last_name age sex preTestScore postTestScore 0 Jason Miller 42.</description>

</item>

<item>

<title>Mocking Functions</title>

<guid>https://chrisalbon.com/python/basics/mocking_functions/</guid>

<description>Preliminaries import unittest import mock from math import exp The Scenario Imagine we have a function that takes in some external API or database and we want to test that function, but with fake (or mocked) inputs. The Python mock library lets us do that.

For this tutorial pretend that math.exp is some expensive operation (e.g. database query, API call, etc) that costs \$10,000 every time we use it. To test it without paying \$10,000, we can create mock_function which imitates the behavior of math.</description>

</item>

<item>

<title>Monitor A Website For Changes With Python</title>

<guid>https://chrisalbon.com/python/web_scraping/monitor_a_website/</guid>

<description>In this snippet, we create a continuous loop that, at set times, scrapes a website, checks to see if it contains some text and, if so, emails me. Specifically, I used this script to find when Venture Beat had published an article about my company.

It should be noted that there are more efficient ways of setting scripts to run at certain times, notably cron. However, this is a quick and dirty solution.</description>

</item>

<item>

<title>Moving Averages In pandas</title>

<guid>https://chrisalbon.com/python/data_wrangling/pandas_moving_average/</guid>

<description>Import Modules # Import pandas import pandas as pd Create Dataframe # Create data data = {&#39;score&#39;: [1,1,1,2,2,2,3,3,3]} # Create dataframe df = pd.DataFrame(data) # View dataframe df score 0 1 1 1 2 1 3 2 4 2 5 2 6 3 7 3 8 3 Calculate Rolling Mean # Calculate the moving average.</description>

</item>

<item>

<title>Nested For Loops Using List Comprehension</title>
