NumPy - String Functions
The NumPy package provides a number of functions, available in the numpy.char module, that operate element-wise on arrays with a string dtype. The most commonly used of these functions are listed below:
Function | Description |
---|---|
add() | Returns element-wise string concatenation for two arrays of string or Unicode. |
multiply() | Returns each string repeated the specified number of times, element-wise. |
center() | Returns a copy of the string with elements centered in the string of specified length. |
capitalize() | Returns a copy of the string with only the first character capitalized. |
title() | Returns a copy of the string with the element-wise title cased version of the string or Unicode. |
lower() | Returns a copy of the string with the elements converted to lowercase. |
upper() | Returns a copy of the string with the elements converted to uppercase. |
split() | Returns a list of the words in each string, split on whitespace by default. |
splitlines() | Returns a list of lines in the string, breaking at line boundaries. |
strip() | Returns a copy of the string with the leading and trailing white spaces removed. |
join() | Returns a string which is the concatenation of the strings in the sequence, joined by the specified separator. |
replace() | Returns a copy of the string with all occurrences of substring replaced by the new string. |
decode() | Decodes each byte string element-wise using the specified codec. |
encode() | Encodes each string element-wise using the specified codec. |
Let's discuss these functions in detail:
numpy.char.add() function
The numpy.char.add() function returns element-wise string concatenation for two arrays of string or Unicode.
    import numpy as np

    Arr1 = np.array(["Hello", "Hi"])
    Arr2 = np.array([" World", " Python"])

    print(np.char.add(Arr1, Arr2))
    print(np.char.add("Python", " Programming"))

The output of the above code will be:

    ['Hello World' 'Hi Python']
    Python Programming
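The two arguments do not have to be arrays of the same shape. As a minimal sketch, assuming the usual NumPy broadcasting rules apply, a single string can be combined with every element of an array. The file names below are hypothetical and used only for illustration.

```python
import numpy as np

# Hypothetical file stems, used only for illustration
stems = np.array(["report", "summary"])

# The scalar suffix is broadcast against every element of the array
with_suffix = np.char.add(stems, ".txt")
print(with_suffix)

# The scalar can also be the first argument
with_prefix = np.char.add("draft_", stems)
print(with_prefix)
```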
numpy.char.multiply() function
The numpy.char.multiply() function returns each string repeated (concatenated with itself) the specified number of times, element-wise.
    import numpy as np

    Arr = np.array(["Hello", "Hi"])

    print(np.char.multiply(Arr, 2))
    print(np.char.multiply("Hi", 3))

The output of the above code will be:

    ['HelloHello' 'HiHi']
    HiHiHi
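The repeat count can also be given per element. The sketch below passes an array of counts instead of a single integer; the variable names are illustrative only. A count of zero yields an empty string.

```python
import numpy as np

words = np.array(["ab", "c", "xyz"])
counts = np.array([1, 3, 0])  # one repeat count per element of words

# Each string is repeated by the count at the same position
repeated = np.char.multiply(words, counts)
print(repeated)
```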
numpy.char.center() function
The numpy.char.center() function returns a copy of the string with elements centered in the string of specified length.
    import numpy as np

    Arr = np.array(["Learn", "Python"])

    #padding the string from both side with *
    #to make the length of string as 10
    print(np.char.center(Arr, 10, "*"))
    print(np.char.center("Hi", 10, "*"))

The output of the above code will be:

    ['**Learn***' '**Python**']
    ****Hi****
numpy.char.capitalize() function
The numpy.char.capitalize() function returns a copy of the string with only the first character capitalized.
    import numpy as np

    Arr = np.array(["learn", "python"])

    print(np.char.capitalize(Arr))
    print(np.char.capitalize("hi"))

The output of the above code will be:

    ['Learn' 'Python']
    Hi
numpy.char.title() function
The numpy.char.title() function returns the title-cased version of each string, element-wise: every word starts with an uppercase character and the remaining characters are lowercase.
    import numpy as np

    Arr = np.array(["learn python", "hello world"])

    print(np.char.title(Arr))
    print(np.char.title("hi john"))

The output of the above code will be:

    ['Learn Python' 'Hello World']
    Hi John
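The difference between capitalize() and title() is easiest to see on multi-word, mixed-case strings. A minimal sketch, relying on the fact that both functions apply Python's str.capitalize() and str.title() to each element (the sample phrases are made up):

```python
import numpy as np

phrases = np.array(["learn pYTHON", "hello world"])

# capitalize(): only the first character of the whole string is uppercase,
# everything else is lowercased
capitalized = np.char.capitalize(phrases)
print(capitalized)

# title(): the first character of every word is uppercase
titled = np.char.title(phrases)
print(titled)
```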
numpy.char.lower() function
The numpy.char.lower() function returns a copy of the string with the elements converted to lowercase.
    import numpy as np

    Arr = np.array(["lEArn", "pyTHon"])

    print(np.char.lower(Arr))
    print(np.char.lower("HeLLo"))

The output of the above code will be:

    ['learn' 'python']
    hello
numpy.char.upper() function
The numpy.char.upper() function returns a copy of the string with the elements converted to uppercase.
    import numpy as np

    Arr = np.array(["lEArn", "pyTHon"])

    print(np.char.upper(Arr))
    print(np.char.upper("HeLLo"))

The output of the above code will be:

    ['LEARN' 'PYTHON']
    HELLO
numpy.char.split() function
The numpy.char.split() function returns, for each element, a list of the words in the string. By default, the string is split on whitespace.
    import numpy as np

    Arr = np.array(["Learn Python", "Hello", "World"])

    print(np.char.split(Arr))
    print(np.char.split("Hi John"))

The output of the above code will be:

    [list(['Learn', 'Python']) list(['Hello']) list(['World'])]
    ['Hi', 'John']
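Splitting is not limited to whitespace. The sketch below uses the optional sep and maxsplit parameters of numpy.char.split() to split on a comma and to limit the number of splits; the sample records are made up for illustration.

```python
import numpy as np

# Hypothetical comma-separated records
records = np.array(["a,b,c", "x,y"])

# Split every element on commas
fields = np.char.split(records, sep=",")
print(fields)

# Split at most once per element, keeping the remainder intact
first_split = np.char.split(records, sep=",", maxsplit=1)
print(first_split)
```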
numpy.char.splitlines() function
The numpy.char.splitlines() function returns a list of lines in the string, breaking at line boundaries.
    import numpy as np

    Arr = np.array(["Programming\nis\nfun", "Python", "Hi\nJohn"])

    print(np.char.splitlines(Arr))
    print(np.char.splitlines("Hi\nJohn"))

The output of the above code will be:

    [list(['Programming', 'is', 'fun']) list(['Python']) list(['Hi', 'John'])]
    ['Hi', 'John']
numpy.char.strip() function
The numpy.char.strip() function returns a copy of the string with leading and trailing whitespace removed.
    import numpy as np

    Arr = np.array(["Hello World ", " Python "])

    print(np.char.strip(Arr))
    print(np.char.strip(" Hi John "))

The output of the above code will be:

    ['Hello World' 'Python']
    Hi John
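Characters other than whitespace can be removed as well. A minimal sketch using the optional chars parameter, together with the related numpy.char.lstrip() and numpy.char.rstrip() functions, which strip only one side (the padded strings are made up for illustration):

```python
import numpy as np

padded = np.array(["**Learn***", "--Python--"])

# Remove any leading and trailing '*' or '-' characters
stripped = np.char.strip(padded, chars="*-")
print(stripped)

# Strip only the left or only the right side
left_only = np.char.lstrip(padded, chars="*-")
right_only = np.char.rstrip(padded, chars="*-")
print(left_only)
print(right_only)
```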
numpy.char.join() function
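The numpy.char.join() function returns a string which is the concatenation of the strings in the sequence, separated by the specified separator. Each array element is itself treated as the sequence, so the separator is inserted between its characters.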
    import numpy as np

    Arr = np.array(["Hi", "Hello"])

    print(np.char.join(":", Arr))
    print(np.char.join(":", "HMS"))

The output of the above code will be:

    ['H:i' 'H:e:l:l:o']
    H:M:S
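Because each element is joined character by character, the result differs from joining a list of words with Python's str.join(). The sketch below contrasts the two and also passes one separator per element; the variable names are illustrative only.

```python
import numpy as np

words = np.array(["abc", "xyz"])
separators = np.array([":", "-"])  # one separator per element of words

# Each separator is inserted between the characters of the matching element
joined = np.char.join(separators, words)
print(joined)

# Plain Python join, for contrast: it joins the words themselves
python_joined = ":".join(["Hi", "Hello"])
print(python_joined)
```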
numpy.char.replace() function
The numpy.char.replace() function returns a copy of the string with all occurrences of substring replaced by the new string.
    import numpy as np

    Arr = np.array(["Hello", "World"])

    #replacing "World" with "Python" in Arr
    print(np.char.replace(Arr, "World", "Python"))

    #replacing "Hi" with "Hello" in given string
    print(np.char.replace("Hi john", "Hi", "Hello"))

The output of the above code will be:

    ['Hello' 'Python']
    Hello john
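The number of replacements can be limited. A minimal sketch using the optional count parameter of numpy.char.replace(), which replaces only the first count occurrences in each element (the sample strings are made up for illustration):

```python
import numpy as np

text = np.array(["a-b-c-d", "1-2"])

# Replace every occurrence of "-" in each element
all_replaced = np.char.replace(text, "-", "+")
print(all_replaced)

# Replace only the first occurrence in each element
first_only = np.char.replace(text, "-", "+", count=1)
print(first_only)
```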
numpy.char.decode() and numpy.char.encode() functions
The numpy.char.encode() function encodes each string element-wise into bytes using the specified codec, whereas the numpy.char.decode() function decodes each byte string element-wise back into a string using the specified codec.
    import numpy as np

    Arr = np.array(["Hello", "World"])

    #encoding Arr
    enstr = np.char.encode(Arr, 'cp500')

    #decoding encoded Arr
    destr = np.char.decode(enstr, 'cp500')

    print("encoded string:", enstr)
    print("decoded string:", destr)

The output of the above code will be:

    encoded string: [b'\xc8\x85\x93\x93\x96' b'\xe6\x96\x99\x93\x84']
    decoded string: ['Hello' 'World']
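The same round trip works with other codecs. As a sketch, the example below uses UTF-8 (a choice made here for illustration, not taken from the example above) on strings with non-ASCII characters, and shows that the encoded array holds bytes rather than text.

```python
import numpy as np

names = np.array(["café", "naïve"])

# Encoding produces an array of byte strings (dtype kind "S")
encoded = np.char.encode(names, "utf-8")
print(encoded.dtype.kind, encoded)

# Decoding with the same codec restores the original text (dtype kind "U")
decoded = np.char.decode(encoded, "utf-8")
print(decoded.dtype.kind, decoded)
```

Decoding with a codec other than the one used for encoding may raise an error or silently produce different characters, so the codec name should be kept alongside the data.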