Python List and Numpy Array

Data Science from Scratch (23 Part Series)

1 Data Science from Scratch: Intro & Setup
2 Higher Order Functions
3 Strings and Exceptions
4 Python List and Numpy Array
5 Dictionaries
6 defaultdict
7 Counters & Sets
8 Control Flow & Truthiness
9 List Comprehensions
10 Understanding Object-Oriented Programming with Assert
11 A Brief Foray into Lazy Evaluation
12 Pseudo-Randomness and a hint of Regex
13 Collections and Comprehensions
14 Collections and Comprehensions (pt.2)
15 Making sense of matplotlib
16 Vectors in Python
17 Matrices in Python
18 Statistics from Scratch with Python
19 Conditional Probability with Python: Concepts, Tables & Code
20 Bayes’ Theorem: Concepts and Code
21 Probability Distributions with Python: Discrete & Continuous
22 Explore Hypothesis Testing using Python
23 Grasping Gradient Descent using Python

Continuing our exploration of Data Science from Scratch by Joel Grus (ch2). You’ll note that the book emphasizes pure Python with minimal libraries, so we may not see much of NumPy in the book. However, since NumPy is so pervasive in data science applications, and since lists and NumPy arrays overlap so much, I think it is useful to take this opportunity to compare and contrast the two, knowing that for most of the book we’ll be using Python lists.

Lists are fundamental to Python, so I’m going to spend some time exploring their features. For data science, NumPy arrays are used frequently, so I thought it’d be good to implement all list operations covered in this section with NumPy arrays to tease apart their similarities and differences.

Below are the similarities.

The basic list operations covered in this section work the same way on NumPy arrays: getting the nth element with square brackets, slicing with start, stop and step, using the in operator to check membership, checking length, and unpacking.

# setup
import numpy as np

# create comparables
python_list = [1,2,3,4,5,6,7,8,9]
numpy_array = np.array([1,2,3,4,5,6,7,8,9])

# bracket operations
# get nth element with square bracket
python_list[0]    # 1
numpy_array[0]    # 1
python_list[8]    # 9
numpy_array[8]    # 9
python_list[-1]   # 9
numpy_array[-1]   # 9

# square bracket to slice
python_list[:3]   # [1, 2, 3]
numpy_array[:3]   # array([1, 2, 3])
python_list[1:5]  # [2, 3, 4, 5]
numpy_array[1:5]  # array([2, 3, 4, 5])

# start, stop, step
python_list[1:8:2]  # [2, 4, 6, 8]
numpy_array[1:8:2]  # array([2, 4, 6, 8])

# use in operator to check membership
1 in python_list  # True
1 in numpy_array  # True
0 in python_list  # False
0 in numpy_array  # False

# finding length
len(python_list)  # 9
len(numpy_array)  # 9

# unpacking
x,y = [1,2]            # now x is 1, y is 2
w,z = np.array([1,2])  # now w is 1, z is 2
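One caveat to the slicing similarity is worth a quick sketch: a list slice is a new list, while a basic array slice is a view onto the same data. The variable names below (list_slice, array_slice, independent) are made up for this illustration.

```python
import numpy as np

python_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
numpy_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# slicing a list creates a new, independent list
list_slice = python_list[:3]
list_slice[0] = 100
# python_list[0] is still 1

# basic slicing of an array returns a view onto the same data
array_slice = numpy_array[:3]
array_slice[0] = 100
# numpy_array[0] is now 100 as well

# an explicit copy keeps the original array untouched
independent = numpy_array[:3].copy()
independent[1] = 200
# numpy_array[1] is still 2
```

So the syntax matches, but writing to an array slice can change the original array unless you call .copy() first.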


Now, here are the differences.

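Some tasks need a different approach with NumPy arrays. Modification is one: a list has extend and append methods that change it in place, while an array is extended with np.append, which returns a new array. Storage is another: a list can hold mixed data types, while a NumPy array converts its elements to one common type (strings, in the example below).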


# python lists can store mixed data types
heterogeneous_list = ['string', 0.1, True]
type(heterogeneous_list[0])  # str
type(heterogeneous_list[1])  # float
type(heterogeneous_list[2])  # bool

# numpy arrays cannot store mixed data types
# here numpy converts every element to a string
homogeneous_numpy_array = np.array(['string', 0.1, True])  # created from mixed data types
type(homogeneous_numpy_array[0])  # numpy.str_
type(homogeneous_numpy_array[1])  # numpy.str_
type(homogeneous_numpy_array[2])  # numpy.str_
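The conversion to strings above is one case of a more general behaviour: NumPy picks a single dtype that can represent every element. Here is a minimal sketch of a few other cases; the variable names are illustrative only.

```python
import numpy as np

# ints and floats are promoted to a common float dtype
int_float = np.array([1, 2.5])
print(int_float.dtype)  # float64

# bools and ints are promoted to an integer dtype
bool_int = np.array([True, 2])
print(np.issubdtype(bool_int.dtype, np.integer))  # True

# adding a string forces every element to a Unicode string dtype
with_string = np.array(['string', 0.1, True])
print(with_string.dtype.kind)  # U

# dtype=object keeps the original Python objects
as_objects = np.array(['string', 0.1, True], dtype=object)
print([type(x).__name__ for x in as_objects])  # ['str', 'float', 'bool']
```

An object array behaves more like a list of Python objects, so it generally gives up the speed and memory benefits of a numeric array.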

# modifying list vs numpy array
# lists can use extend to modify list in place
python_list.extend([10,12,13])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13]
numpy_array.extend([10,12,13])  # AttributeError: 'numpy.ndarray' object has no attribute 'extend'

# numpy array must use np.append, instead of extend
# np.append returns a new array, so reassign the result
numpy_array = np.append(numpy_array,[10,12,13])

# python lists can be concatenated with other lists using +
new_python_list = python_list + [14,15]  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15]

# with numpy arrays, + means element-wise addition, not concatenation
numpy_array + [14,15]  # ValueError: shapes (12,) and (2,) cannot be broadcast together

# use np.append to concatenate instead
new_numpy_array = np.append(numpy_array, [14,15])  # array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15])

# python lists have the append method
python_list.append(0)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 0]

# np.append is a function, not a method, and returns a new array
numpy_array = np.append(numpy_array, [0])
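Two points from the code above are easy to miss, so here is a small sketch using fresh three-element examples: np.append does not change the array it is given, and + on arrays adds element by element when the shapes match. np.concatenate is shown as an alternative to np.append; it is not used in the original post.

```python
import numpy as np

python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

# list.extend changes the list in place
python_list.extend([4, 5])  # python_list is now [1, 2, 3, 4, 5]

# np.append leaves its input alone and returns a new array
appended = np.append(numpy_array, [4, 5])
# numpy_array is still array([1, 2, 3])
# appended is array([1, 2, 3, 4, 5])

# + concatenates lists ...
joined = [1, 2, 3] + [10, 20, 30]    # [1, 2, 3, 10, 20, 30]
# ... but adds arrays element by element when the shapes match
summed = numpy_array + [10, 20, 30]  # array([11, 22, 33])

# np.concatenate is another way to join arrays end to end
concatenated = np.concatenate([numpy_array, np.array([4, 5])])
```

Because np.append builds a new array on every call, growing an array one element at a time in a loop tends to be slower than collecting values in a list and converting once at the end.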


Python lists and NumPy arrays have much in common, but there are meaningful differences as well.

Python Lists vs NumPy Arrays: What’s the difference?

Now that we know there are meaningful differences, what can we attribute them to? This explainer from UCF highlights differences in three areas:

  • Size
  • Performance
  • Functionality
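To get a rough feel for these three areas, here is a small sketch of my own (it is not taken from the UCF explainer). The size n and all variable names are arbitrary choices, and the timings will vary from machine to machine, so no expected numbers are shown for them.

```python
import sys
import timeit
import numpy as np

n = 100_000
python_list = list(range(n))
numpy_array = np.arange(n)

# Size: a list stores pointers to separate int objects,
# while an array stores fixed-width numbers in one buffer
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(x) for x in python_list)
array_bytes = numpy_array.nbytes
print(list_bytes, array_bytes)

# Performance: time doubling every element, 20 runs each
list_time = timeit.timeit(lambda: [x * 2 for x in python_list], number=20)
array_time = timeit.timeit(lambda: numpy_array * 2, number=20)
print(list_time, array_time)

# Functionality: arrays support element-wise arithmetic and reductions directly
doubled = numpy_array * 2
mean_value = numpy_array.mean()
print(doubled[:3], mean_value)
```

Note that python_list * 2 would repeat the list rather than double each element, which is why the list version needs a comprehension.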

I’m tempted to go down this rabbit hole of further list vs array comparisons, but we’ll hold off for now.


For more content on data science, machine learning, R, Python, SQL and more, find me on Twitter.

