I am cleaning some data for practice. The restriction is that I am not using Pandas, so I am doing it in plain Python.
My data is a list of lists; consider this:
dataset = [["My name is Anas", 1.92],["I am data Scientist",1.88],["I am Studying BSCS",2.0]]
The float in each inner list is only there to show that the inner lists hold more than one value.
My code is:
for i in dataset:
    for j in i:
        print(j[0].split())
The output is now ["My","name","is","Anas"], and the same for the other rows.
I want my output to look like this: ["M","y","n","a","m","e","i","s","A","n","a","s"]
or like this:
[['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's'],[ 'I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't'], ['I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']]
How can I change this code to get that output? Please reply with your valuable answers.
CodePudding user response:
If it is not important for the exercise that you use a nested for, then you can simply use a single for loop and rely on the fact that a string is already a sequence of characters, so list() splits it for you.
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]
for row in dataset:
    print(list(row[0]))
If a result that excludes spaces is important, you can build on the snippet above and use a list comprehension to filter them out:
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]
for row in dataset:
    print([
        character
        for character in row[0]
        if character.strip()  # empty string (falsy) for whitespace
    ])
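Since the question started from split(), here is a minimal sketch of another way to drop the whitespace: split the text into words, join the words back together with no separator, and then turn the result into a list. The names `text` and `chars_per_row` are only illustrative and do not come from the question.

```python
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]

chars_per_row = []
for row in dataset:
    text = row[0]
    # split() with no arguments splits on any run of whitespace,
    # so joining the pieces with "" removes all of it.
    chars_per_row.append(list("".join(text.split())))

print(chars_per_row)
```

This gives one list of characters per row, with the float column ignored.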
If the use of a nested for is central to the task, then I would do something like this:
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]
for row in dataset:
    for column in row:
        # Only split the text column; skip the float.
        if isinstance(column, str):
            print(list(column))
Rather than isinstance(), you could use enumerate() and act only on index 0 if you wished. Again, this could be combined with a comprehension to exclude spaces.
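The enumerate() variant could look roughly like the sketch below. It assumes the text is always at index 0 of each row, as in the sample data, and the names `result`, `index` and `ch` are only illustrative.

```python
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]

result = []
for row in dataset:
    for index, column in enumerate(row):
        # Assumption: the text lives in the first column only.
        if index == 0:
            # Keep every character that is not whitespace.
            result.append([ch for ch in column if not ch.isspace()])

print(result)
```

Compared with isinstance(), this selects by position rather than by type, so it would skip a second string column if the rows ever had one.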
CodePudding user response:
You can use:
k = []
for i in dataset:
    for j in i[0]:
        if j != ' ':
            k.append(j)
print(k)
#['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's', 'I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't', 'I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']
If you just want:
["M","y","n","a","m","e","i","s","A","n","a","s"]
you can do:
k = []
# dataset[0][0] is the first row's string; iterate over it directly.
for j in dataset[0][0]:
    if j != ' ':
        k.append(j)
print(k)
#['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's']
Edit:
I think you are looking for this:
print([[j for j in i[0] if j!=' '] for i in dataset])
#[['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's'], ['I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't'], ['I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']]
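If the single flat list is what you need instead, the same comprehension style can produce it by putting both loops in one comprehension. This is only a sketch; the names `flat`, `row` and `ch` are illustrative.

```python
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]

# The outer loop walks the rows, the inner loop walks the characters
# of each row's string, and the condition drops the spaces.
flat = [ch for row in dataset for ch in row[0] if ch != ' ']

print(flat)
```

The loops in a comprehension read left to right in the same order as the equivalent nested for statements.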
CodePudding user response:
I guess something like this, if you don't want the spaces:
for row in dataset:
    string = row[0]
    print([c for c in string if c != ' '])
CodePudding user response:
characters = []
for i in dataset:
    for char in i[0]:
        # Keep everything except spaces.
        if char != " ":
            characters.append(char)
print(characters)