Let's say I have a string ndarray with any dimension. For example: [["abc", "def"], ["ghi", "jkl"]]. Now, I want to split each string into separate chars, so that in effect a new axis is inserted as the second dimension: [[['a', 'd'], ['b', 'e'], ['c', 'f']], [['g', 'j'], ['h', 'k'], ['i', 'l']]]. Or better said, it should behave like MATLAB converting a string array to a char array:
A =
2×2 string array
"abc" "def"
"ghi" "jkl"
should become:
2×3×2 char array
ans(:,:,1) =
'abc'
'ghi'
ans(:,:,2) =
'def'
'jkl'
I tried functions like np.frompyfunc, np.apply_over_axes and np.apply_along_axis, but so far nothing worked for me. Is there a clever trick to do this?
The reverse is actually pretty simple:
import numpy as np

def chars_to_strings(x):  # hypothetical wrapper name
    def row_to_string(row):  # row holds character codes
        return ''.join(chr(int(c)) for c in row)
    return np.apply_along_axis(row_to_string, 1, np.asarray(x))
CodePudding user response:
Here you go:
In [1]: A = [["abc", "def"], ["ghi", "jkl"]]
In [2]: [[[*t] for t in zip(*a)] for a in A]
Out[2]: [[['a', 'd'], ['b', 'e'], ['c', 'f']], [['g', 'j'], ['h', 'k'], ['i', 'l']]]
CodePudding user response:
In [159]: alist = [["abc", "def"], ["ghi", "jkl"]]
In [160]: np.frompyfunc(list,1,1)(alist)
Out[160]:
array([[list(['a', 'b', 'c']), list(['d', 'e', 'f'])],
       [list(['g', 'h', 'i']), list(['j', 'k', 'l'])]], dtype=object)
In [161]: np.array(np.frompyfunc(list,1,1)(alist).tolist())
Out[161]:
array([[['a', 'b', 'c'],
        ['d', 'e', 'f']],

       [['g', 'h', 'i'],
        ['j', 'k', 'l']]], dtype='<U1')
From there you can transpose to the desired layout.
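The transpose step could look something like the following. This is only a sketch for the 2×2 example from the question; the names `chars` and `result` are illustrative, and it assumes all strings have the same length. Swapping the last two axes moves the character axis from the last position to the second one, which gives the 2×3×2 layout asked for.

```python
import numpy as np

alist = [["abc", "def"], ["ghi", "jkl"]]

# Split every string into a list of characters, then rebuild a regular array.
# Assumption: all strings have equal length, otherwise the nested lists are ragged.
chars = np.array(np.frompyfunc(list, 1, 1)(alist).tolist())  # shape (2, 2, 3)

# Swap the last two axes so the character axis becomes the second dimension.
result = chars.swapaxes(1, 2)  # shape (2, 3, 2)

print(result.tolist())
```

For inputs with more than two dimensions, np.moveaxis(chars, -1, 1) would place the character axis at the second position instead; which position is wanted depends on the layout you are after.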
Splitting a string into characters is a Python task. NumPy doesn't have special string code, just the np.char functions, which apply string methods to the elements of a string-dtype array.
frompyfunc is a convenience tool for applying a Python function to the elements of an array. It doesn't compile anything, but its speed is usually comparable to that of a list comprehension, and it may be more convenient.
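To make the comparison with a list comprehension concrete, here is a small sketch using `len` instead of `list`. The array contents and the names `str_len`, `lengths` and `lengths_lc` are made up for illustration. Note that the result of a frompyfunc call has object dtype, so a cast is needed if a numeric array is wanted.

```python
import numpy as np

arr = np.array([["abc", "de"], ["f", "ghij"]])

# frompyfunc(func, nin, nout) wraps a Python callable so that it is called
# once per element; the output array has object dtype.
str_len = np.frompyfunc(len, 1, 1)
lengths = str_len(arr)

# The equivalent nested list comprehension for this 2-D case.
lengths_lc = [[len(s) for s in row] for row in arr.tolist()]

print(lengths.dtype)
print(lengths.astype(int))
print(lengths_lc)
```

The frompyfunc version works unchanged for arrays of any dimension, whereas the list comprehension has to be nested to match the depth of the input.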