NumPy interview questions with brief answers
NumPy, short for Numerical Python, is a powerful library for numerical computations in Python. It forms the foundation for many scientific computing and data science libraries in Python, including Pandas, SciPy, and TensorFlow. Here’s an overview of what makes NumPy essential.
1. What is NumPy, and why is it used?
- Answer:
NumPy is a powerful Python library for numerical computing. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. It's widely used for scientific and mathematical computations due to its efficiency and array-manipulation capabilities.
2. Explain the main difference between a Python list and a NumPy array.
- Answer:
Python lists can hold elements of different data types and are slower for numerical computations. NumPy arrays are homogeneous (all elements must be of the same data type), provide vectorized operations, and are more memory-efficient and faster.
3. How do you create a NumPy array?
- Answer:
You can create an array using np.array() by passing a list, e.g., np.array([1, 2, 3]). You can also use functions like np.zeros(), np.ones(), and np.arange() for specific patterns of arrays.
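4. What is broadcasting, and what are its rules in NumPy?
- Answer:
Broadcasting allows NumPy to perform element-wise operations on arrays of different shapes. Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension; two dimensions are compatible when they are equal or when one of them is 1. If one array has fewer dimensions, its shape is treated as if padded with 1s on the left.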
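A minimal sketch of the broadcasting rules in action. The array names (matrix, row, col) are illustrative only; the last case shows a shape pair that fails because the trailing dimensions are neither equal nor 1.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,), treated as (1, 3)
col = np.array([[100], [200]])        # shape (2, 1)

row_sum = matrix + row   # row is added to every row of matrix
col_sum = matrix + col   # col is added to every column of matrix
print(row_sum.shape, col_sum.shape)

# Shapes (2, 3) and (2,) are incompatible: trailing dimensions 3 and 2 differ.
try:
    matrix + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)
```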
5. How do you perform element-wise operations on arrays?
- Answer:
Element-wise operations are straightforward in NumPy. You can use operators (+, -, *, /) directly on arrays of the same shape or use broadcasting for arrays of compatible shapes.
6. What does the reshape() function do?
- Answer:
reshape() changes the shape of an array without altering its data. For example, np.array([1, 2, 3, 4]).reshape(2, 2) transforms a 1D array into a 2x2 array.
7. How do you concatenate arrays in NumPy?
- Answer:
You can use np.concatenate() to join arrays along an axis. Other functions like np.vstack() (vertical stack) and np.hstack() (horizontal stack) are also available for specific stacking needs.
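8. Explain the difference between np.copy() and array.view().
- Answer:
np.copy() (or array.copy()) creates a new array with its own data, so changes in the new array don’t affect the original. array.view() creates a new array object that shares the same data buffer, so modifying one affects the other. Note that view() is an array method; there is no top-level np.view() function.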
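The difference can be checked directly, as in the sketch below. The variable names are illustrative; np.shares_memory() is used here only as a convenient way to confirm whether two arrays point at the same data.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

independent = original.copy()   # owns its own data
shared = original.view()        # new array object, same data buffer

independent[0] = 99   # does not touch original
shared[1] = -1        # writes through to original

print(original)
print(np.shares_memory(original, shared))
print(np.shares_memory(original, independent))
```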
9. What are the advantages of vectorization in NumPy?
- Answer:
Vectorization allows operations on entire arrays without explicit loops, making code more efficient, faster, and easier to read, as NumPy internally optimizes these operations.
10. How would you calculate the mean, median, and standard deviation of an array?
- Answer:
Use np.mean(array), np.median(array), and np.std(array), respectively.
11. What is the purpose of np.where()?
- Answer:
np.where(condition, x, y) is a conditional function that returns elements from x where condition is true, and from y otherwise. It’s useful for conditional selections.
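A short sketch of np.where() with made-up data. The second call uses the single-argument form, which is not covered in the answer above: called with only a condition, np.where() returns a tuple of index arrays where the condition holds.

```python
import numpy as np

values = np.array([-2, 5, 0, -7, 3])

# Three-argument form: take 0 where the value is negative, else keep the value.
non_negative = np.where(values < 0, 0, values)

# One-argument form: indices where the condition is true (one array per axis).
negative_idx = np.where(values < 0)[0]

print(non_negative)
print(negative_idx)
```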
12. How can you generate random numbers using NumPy?
- Answer:
The np.random module provides functions like np.random.rand() for a uniform distribution, np.random.randn() for the standard normal distribution, and np.random.randint() for random integers.
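13. Explain the difference between np.dot() and np.matmul().
- Answer:
np.dot() computes the inner product for 1D vectors, matrix multiplication for 2D arrays, and a sum-product over specific axes for higher-dimensional inputs; it also accepts scalars. np.matmul() (also available as the @ operator) is specifically for matrix multiplication: it rejects scalar inputs and treats arrays with more than two dimensions as stacks of matrices, broadcasting over the leading dimensions.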
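The two functions agree for 1D and 2D inputs and diverge for scalars and higher-dimensional arrays. The sketch below uses arbitrary all-ones arrays just to compare the resulting shapes.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
inner = np.dot(v, w)          # 1*4 + 2*5 + 3*6

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))
matmul_shape = np.matmul(a, b).shape   # stack of two (3, 4) @ (4, 5) products
dot_shape = np.dot(a, b).shape         # sum over last axis of a, second-to-last of b

scalar_product = np.dot(2, 3)          # np.dot() accepts scalars
try:
    np.matmul(2, 3)                    # np.matmul() raises an error for scalars
    matmul_accepts_scalars = True
except (ValueError, TypeError):
    matmul_accepts_scalars = False

print(inner, matmul_shape, dot_shape, matmul_accepts_scalars)
```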
14. What is np.linspace() and how does it differ from np.arange()?
- Answer:
np.linspace(start, stop, num) generates num evenly spaced numbers between start and stop, including stop by default. In contrast, np.arange(start, stop, step) generates values separated by a fixed step and excludes stop, so the number of elements follows from the step rather than being specified directly.
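A small side-by-side comparison, using an arbitrary range from 0 to 1: np.linspace() is told how many points to produce, while np.arange() is told how far apart they are.

```python
import numpy as np

lin = np.linspace(0.0, 1.0, 5)   # 5 points, endpoint 1.0 included by default
ar = np.arange(0.0, 1.0, 0.25)   # step of 0.25, stop value 1.0 excluded

print(lin)   # 0, 0.25, 0.5, 0.75, 1
print(ar)    # 0, 0.25, 0.5, 0.75
```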
15. How can you find unique elements in a NumPy array?
- Answer:
Use np.unique(array) to get the unique elements. This function also has options to return counts or the indices of unique elements in the array.
16. How do you handle missing values in a NumPy array?
- Answer:
NumPy doesn’t have built-in handling for missing values, but you can represent them using np.nan (for floating-point arrays) or use masked arrays via np.ma.masked_array(). Functions like np.nanmean() and np.isnan() are helpful when working with np.nan values.
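As a sketch with made-up data, the following compares three ways of averaging an array that contains np.nan. np.ma.masked_invalid() is one convenient way to build the masked array; it is used here as an alternative to constructing the mask by hand.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0, 4.0])

plain_mean = np.mean(data)        # nan propagates through ordinary reductions
nan_mean = np.nanmean(data)       # ignores nan entries
filtered = data[~np.isnan(data)]  # drop nan entries with a boolean mask

masked = np.ma.masked_invalid(data)  # mask nan (and inf) entries
masked_mean = masked.mean()          # masked entries are skipped

print(plain_mean, nan_mean, masked_mean)
```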
17. Explain the use of axis in NumPy functions.
- Answer:
The axis parameter specifies the dimension along which a function is applied. For example, in a 2D array, axis=0 runs down the rows (giving one result per column), and axis=1 runs across the columns (giving one result per row). Functions like np.sum(), np.mean(), etc., take an axis argument to apply the operation along that specific dimension.
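A quick illustration with a 2x3 array (the name table is arbitrary): the axis you pass is the one that gets collapsed, so summing over axis=0 leaves one value per column and summing over axis=1 leaves one value per row.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])    # shape (2, 3)

col_totals = table.sum(axis=0)   # collapses the 2 rows -> shape (3,)
row_totals = table.sum(axis=1)   # collapses the 3 columns -> shape (2,)
grand_total = table.sum()        # no axis: sum over all elements

print(col_totals, row_totals, grand_total)
```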
18. What is slicing in NumPy, and how does it work?
- Answer:
Slicing extracts a subset of an array. You specify start, stop, and step, e.g., array[start:stop:step]. For multi-dimensional arrays, you can slice each axis separately: array[start1:stop1, start2:stop2].
19. How do you flatten a multi-dimensional array?
- Answer:
Use array.flatten() or array.ravel(). Both convert multi-dimensional arrays to 1D arrays, but flatten() returns a copy, while ravel() returns a view when possible.
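The copy-versus-view behavior can be observed by writing into the results. In this sketch the array is freshly created and contiguous, which is the case where ravel() is able to return a view.

```python
import numpy as np

grid = np.zeros((2, 3))

flat_copy = grid.flatten()   # always a copy
flat_view = grid.ravel()     # a view here, because grid is contiguous

flat_copy[0] = 1.0   # grid is unchanged
flat_view[1] = 2.0   # grid[0, 1] changes too

print(grid)
```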
20. How can you sort a NumPy array?
- Answer:
np.sort(array) returns a sorted copy of the array along a specified axis (the last axis by default), while array.sort() sorts the array in place. You can also use array.argsort() to get the indices that would sort the array.
21. What are structured arrays in NumPy?
- Answer:
Structured arrays allow you to store mixed data types in each row, like a database table with columns of different data types. Define it using a structured data type (dtype), e.g., np.array([(1, 'A'), (2, 'B')], dtype=[('col1', int), ('col2', 'U1')]).
22. How does NumPy handle memory layout, and what are C-contiguous and F-contiguous arrays?
- Answer:
NumPy supports both C-contiguous (row-major, or row-first) and F-contiguous (column-major, or column-first) memory layouts. Passing order='C' or order='F' when creating an array, e.g., np.array(data, order='F'), specifies the layout, which can impact performance, especially with large arrays and when interfacing with other libraries.
23. Explain the purpose of np.meshgrid() and its usage.
- Answer:
np.meshgrid() creates coordinate matrices from coordinate vectors. It’s useful in generating a grid of values for plotting and vectorized evaluations of 2D functions. Example: np.meshgrid(x, y), where x and y are 1D arrays.
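A small sketch of evaluating a 2D function on a grid. The coordinate values and the function x**2 + y are arbitrary choices for illustration; with the default indexing, the output arrays have shape (len(y), len(x)).

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0])
y = np.array([0.0, 1.0])

X, Y = np.meshgrid(x, y)   # both have shape (2, 3)
Z = X**2 + Y               # evaluated at every (x, y) pair, no loops

print(X)
print(Y)
print(Z)
```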
24. What do np.argmax() and np.argmin() do?
- Answer:
np.argmax() returns the index of the maximum value in an array, while np.argmin() returns the index of the minimum value. You can specify an axis to get indices along a specific dimension.
25. What is vectorization, and why is it important in NumPy?
- Answer:
Vectorization in NumPy refers to performing operations on entire arrays without explicit loops, making operations faster by utilizing low-level optimizations. This enhances both performance and code readability.
26. How can you compute the inverse and the determinant of a matrix in NumPy?
- Answer:
Use np.linalg.inv(matrix) for matrix inversion (if the matrix is square and invertible) and np.linalg.det(matrix) for the determinant.
27. Explain how np.einsum() works and its applications.
- Answer:
np.einsum() is used for complex summations and broadcasting operations based on Einstein notation. It provides flexibility in performing operations like matrix multiplication, transpositions, and other tensor manipulations in a single function call.
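A few common subscript strings, sketched with small arbitrary arrays. Each letter names an axis; letters that appear in the inputs but not after the arrow are summed over.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

product = np.einsum('ij,jk->ik', a, b)     # matrix multiplication (sum over j)
transposed = np.einsum('ij->ji', a)        # transpose (axes swapped, nothing summed)
trace = np.einsum('ii->', np.eye(3))       # trace (sum over the diagonal)
row_sums = np.einsum('ij->i', a)           # sum over j for each row

print(product.shape, transposed.shape, trace, row_sums)
```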
28. What is the difference between np.all() and np.any()?
- Answer:
np.all() returns True if all elements in an array satisfy a condition, while np.any() returns True if at least one element satisfies the condition. Both support an axis argument for multi-dimensional arrays.
29. How can you split a NumPy array?
- Answer:
Use functions like np.split(array, sections, axis) for splitting along an axis. Other functions are available for specific needs: np.array_split() allows uneven splits, np.hsplit() splits horizontally, and np.vsplit() splits vertically.
30. What are the key differences between np.dot() and np.outer()?
- Answer:
np.dot() performs a dot product for 1D arrays or matrix multiplication for 2D arrays. np.outer() computes the outer product, producing a 2D array by multiplying each element of the first array by each element of the second.
31. How can you change the data type of an array in NumPy?
- Answer:
Use array.astype(new_dtype), which returns a new array with the specified data type. This is useful for converting data types, e.g., from float to int.
32. Explain np.cumsum() and np.cumprod().
- Answer:
np.cumsum() returns the cumulative sum of elements along a specified axis, while np.cumprod() returns the cumulative product. These functions are useful for accumulating values over an array.
33. How do you find the intersection and union of two arrays?
- Answer:
np.intersect1d(array1, array2) finds the intersection, and np.union1d(array1, array2) finds the union. These functions are helpful for set operations on arrays.
34. What is the purpose of np.tile() and np.repeat()?
- Answer:
np.tile(array, reps) repeats the whole array along the specified dimensions, while np.repeat(array, repeats, axis) repeats each element of an array along a specified axis.
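The difference is easiest to see on a short 1D array (the values below are arbitrary): np.tile() repeats the sequence as a block, while np.repeat() repeats each element in place.

```python
import numpy as np

x = np.array([1, 2, 3])

tiled = np.tile(x, 2)           # the whole array, twice
repeated = np.repeat(x, 2)      # each element, twice
tiled_2d = np.tile(x, (2, 1))   # two stacked copies of x as rows

print(tiled)      # 1 2 3 1 2 3
print(repeated)   # 1 1 2 2 3 3
print(tiled_2d.shape)
```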
35. How would you create a random sample from a probability distribution in NumPy?
- Answer:
NumPy offers several random sampling functions. For instance, np.random.normal(mean, std_dev, size) samples from a normal distribution, np.random.poisson(lam, size) samples from a Poisson distribution, and np.random.choice(array, size) samples with replacement from an array.
36. How can you save and load NumPy arrays to/from disk?
- Answer:
Use np.save('filename.npy', array) to save an array in binary format and np.load('filename.npy') to load it back. For multiple arrays, use np.savez('filename.npz', array1, array2). You can also save in text format with np.savetxt() and np.loadtxt().
37. What is memory mapping in NumPy, and when would you use it?
- Answer:
Memory mapping lets you work with data stored on disk as if it were an array in memory, loading only the parts that are accessed rather than the whole file at once. This is useful for handling large datasets that don’t fit into memory. You can create a memory-mapped array with np.memmap().
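38. Explain the difference between np.sum() and np.add.reduce().
- Answer:
np.sum() calculates the sum of array elements and by default sums over all axes. np.add.reduce() is the reduce() method of the np.add ufunc: it applies addition across elements along one axis (axis 0 by default). The same ufunc.reduce() mechanism works for other binary ufuncs, e.g., np.multiply.reduce(), which makes it useful in custom aggregation operations.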
39. How does NumPy handle array broadcasting internally?
- Answer:
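Internally, NumPy conceptually “stretches” arrays along the dimensions where the shape is 1 so that the shapes match. No data is actually replicated: the stretched dimension is given a stride of 0, so the same values are reused. This avoids copying data while still allowing element-wise operations over larger shapes.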
40. What is a universal function (ufunc) in NumPy?
- Answer:
Ufuncs are functions that operate element-wise on arrays, like np.add, np.subtract, and np.sin. They are optimized for vectorized operations and support broadcasting, making array operations faster and more efficient.
41. Explain the use of np.fromfunction() and provide an example.
- Answer:
np.fromfunction() constructs an array by executing a function over each coordinate of the array. For example, np.fromfunction(lambda i, j: i + j, (3, 3)) creates a 3x3 array where each element is the sum of its indices.
42. What are views and copies in NumPy, and how do you differentiate them?
- Answer:
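A view is a shallow copy that shares data with the original array, so changes in one are reflected in the other. A copy is a deep copy that is independent of the original. Use array.copy() for a deep copy and slicing or array.view() for a view. To tell them apart, check array.base (it is None for an array that owns its data) or call np.shares_memory(a, b).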
43. What does np.gradient() do?
- Answer:
np.gradient(array) calculates the numerical gradient (rate of change) of an array, useful for finding derivatives in numerical computations. It’s particularly helpful in scientific and engineering contexts.
44. How do you perform element-wise logical operations in NumPy?
- Answer:
Use logical ufuncs like np.logical_and(), np.logical_or(), and np.logical_not() for element-wise logical operations. For example, np.logical_and(array1, array2) performs element-wise AND on the arrays.
45. What is np.meshgrid() commonly used for in scientific computing?
- Answer:
np.meshgrid() creates coordinate grids, which are essential for visualizations and evaluations over 2D spaces, such as plotting contours, surfaces, and vector fields.
46. Explain the purpose and usage of np.clip().
- Answer:
np.clip(array, min, max) limits the values in an array to a specified range. Values below min are set to min, and values above max are set to max. It’s commonly used in image processing and numerical stabilizations.
47. How can you efficiently compute the Euclidean distance between two points in NumPy?
- Answer:
Use np.linalg.norm(point1 - point2) to calculate the Euclidean distance. Alternatively, np.sqrt(np.sum((point1 - point2)**2)) provides the same result manually.
48. What are masked arrays in NumPy, and when would you use them?
- Answer:
Masked arrays allow you to “mask” certain elements in an array, ignoring them in computations. Use np.ma.array(data, mask=condition) to handle missing or invalid data without removing elements.
49. Explain the difference between np.floor(), np.ceil(), and np.round().
- Answer:
np.floor() rounds elements down to the nearest integer, np.ceil() rounds elements up, and np.round() rounds to the nearest integer (or to a given number of decimals). Values exactly halfway between two integers are rounded to the nearest even one, so np.round(2.5) gives 2.0, not 3.0.
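The three functions are compared below on values that sit exactly halfway between integers, which is where they differ most. Note that np.round() resolves such ties by rounding to the nearest even integer.

```python
import numpy as np

x = np.array([-1.5, -0.5, 0.5, 1.5, 2.5])

floored = np.floor(x)    # toward negative infinity
ceiled = np.ceil(x)      # toward positive infinity
rounded = np.round(x)    # nearest integer, ties go to the even neighbor

print(floored)
print(ceiled)
print(rounded)
```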
50. How would you solve a system of linear equations with NumPy?
- Answer:
Use np.linalg.solve(A, B), where A is the coefficient matrix (square and non-singular) and B is the right-hand-side vector or matrix. It returns the values of the variables that satisfy the equations.
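As a worked sketch, take the made-up system 2x + y = 5 and x + 3y = 10. The coefficients go into A row by row and the right-hand sides into B.

```python
import numpy as np

# 2x +  y = 5
#  x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([5.0, 10.0])

solution = np.linalg.solve(A, B)   # array holding x and y

# Verify by substituting the solution back into the equations.
print(solution, np.allclose(A @ solution, B))
```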
51. What does the np.dot() function do for 1D, 2D, and higher-dimensional arrays?
- Answer:
For 1D arrays, np.dot() computes the inner product. For 2D arrays, it performs matrix multiplication. For higher-dimensional arrays, it performs a sum-product over the last axis of the first array and the second-to-last axis of the second array.
52. What is the difference between np.allclose() and np.isclose()?
- Answer:
np.allclose(array1, array2) checks if all elements in two arrays are approximately equal within a tolerance and returns a single boolean. np.isclose(array1, array2) returns an array of booleans indicating element-wise closeness.
53. How does NumPy handle complex numbers?
- Answer:
NumPy supports complex numbers with dtype=complex (or complex64, complex128). Use functions like np.real, np.imag, and np.conj to get the real part, the imaginary part, and the complex conjugate of complex arrays.
54. What does np.diff() do?
- Answer:
np.diff(array) calculates the discrete difference between consecutive elements in an array, useful for finding changes between elements in a sequence.
55. How do you calculate eigenvalues and eigenvectors in NumPy?
- Answer:
Use np.linalg.eig(matrix) to compute the eigenvalues and eigenvectors of a square matrix. The function returns an array of eigenvalues and a 2D array whose columns are the corresponding eigenvectors.
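A minimal check on a small symmetric matrix chosen for illustration. Each eigenvector is a column of the second return value, and it should satisfy matrix @ v = eigenvalue * v. The order in which eigenvalues are returned is not guaranteed, so the sketch avoids relying on it.

```python
import numpy as np

matrix = np.array([[2.0, 1.0],
                   [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Column i of eigenvectors pairs with eigenvalues[i].
first_vector = eigenvectors[:, 0]
residual = matrix @ first_vector - eigenvalues[0] * first_vector

print(np.sort(eigenvalues))       # 1 and 3 for this matrix
print(np.allclose(residual, 0))
```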
56. What is the difference between np.mean() and np.average() in NumPy?
- Answer:
np.mean() calculates the simple average of array elements, while np.average() allows for weighted averages when a weights parameter is provided.
57. How can you compute the inverse of a matrix in NumPy?
- Answer:
Use np.linalg.inv(matrix) to compute the inverse. The matrix must be square and non-singular (having a non-zero determinant).
58. What do np.tril() and np.triu() do?
- Answer:
np.tril(matrix) returns the lower triangular part of the matrix, setting elements above the main diagonal to zero. np.triu(matrix) returns the upper triangular part, setting elements below the main diagonal to zero.
59. Explain np.cov() and its usage.
- Answer:
np.cov(array) computes the covariance matrix, which shows how variables vary in relation to each other. By default, each row of the input is treated as a variable and each column as an observation. It’s often used in statistical analysis and data science for exploring relationships between variables.
60. What does np.cross() compute?
- Answer:
np.cross(a, b) calculates the cross product of two vectors in 3D space. It’s commonly used in physics and engineering applications involving vector calculations.