Splitting strings into single characters when the strings contain spaces

I am cleaning some data for practice. The restriction is that I am not using Pandas, so I am doing it in plain Python.

My data is a list of lists; consider this:

dataset = [["My name is Anas", 1.92],["I am data Scientist",1.88],["I am Studying BSCS",2.0]]

The float at index 1 of each inner list is only there to show that every inner list holds multiple values.

My code is

for i in dataset:
    for j in i:
        # Only the string column can be split; the float would raise an error.
        if isinstance(j, str):
            print(j.split())

The output is now ["My","name","is","Anas"], and similarly for the other rows.

I want my output to look like this: ["M","y","n","a","m","e","i","s","A","n","a","s"]

or like this

[['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's'], ['I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't'], ['I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']]

How can I optimize this code? Please reply with your valuable answers.

CodePudding user response：

If the exercise does not require a nested for, you can use a single for and rely on the fact that a string is already a sequence of characters, so list() splits it into single characters.

dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist",1.88],
    ["I am Studying BSCS",2.0]
]

for row in dataset:
    print(list(row[0]))

If a result that excludes spaces is important, you can build on the code above and use a list comprehension to filter them out:

dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist",1.88],
    ["I am Studying BSCS",2.0]
]

for row in dataset:
    print([
        character
        for character
        in row[0]
        if character.strip()
    ])
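
As a small sketch of why character.strip() works as the filter: stripping a whitespace-only character leaves an empty string, which is falsy, so the comprehension drops it. Unlike a comparison with ' ', this also drops tabs and newlines. The sample string text below is made up for illustration and is not part of the question's dataset.

```python
# Hypothetical sample containing a space, a tab, and a newline.
text = "My name\tis Anas\n"

# strip() turns any whitespace-only character into "", which is falsy.
no_whitespace = [c for c in text if c.strip()]

# Comparing against " " removes plain spaces only; "\t" and "\n" remain.
no_spaces = [c for c in text if c != " "]

print(no_whitespace)
print(no_spaces)
```

If the data only ever contains plain spaces, both filters give the same result.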

If the use of a nested for is central to the task, then I would do something like this:

dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist",1.88],
    ["I am Studying BSCS",2.0]
]

for row in dataset:
    for column in row:
        if isinstance(column, str):
            print(list(column))

Rather than isinstance(), you could use enumerate() and act only on index 0 if you wished. Again, this could be modified with a comprehension to exclude spaces.
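
A minimal sketch of that enumerate() variant, assuming the text always sits at index 0 of each row as in the question's dataset. The result list is a name introduced here so the output can be collected instead of printed row by row.

```python
dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]

result = []
for row in dataset:
    for index, column in enumerate(row):
        # Act only on the first column, which holds the text.
        if index == 0:
            # The comprehension drops the spaces while splitting.
            result.append([ch for ch in column if ch != " "])

print(result)
```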

CodePudding user response：

You can use:

k = []
for i in dataset:
    for j in i[0]:
        if j != ' ':
            k.append(j)

print(k)
#['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's', 'I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't', 'I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']

If you just want the first row:

["M","y","n","a","m","e","i","s","A","n","a","s"]

you can do:

k = []
# dataset[0][0] is the first row's string, so one loop over it is enough.
for j in dataset[0][0]:
    if j != ' ':
        k.append(j)

print(k)
#['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's']

Edit:

I think you are looking for this:

print([[j for j in i[0] if j!=' '] for i in dataset])

#[['M', 'y', 'n', 'a', 'm', 'e', 'i', 's', 'A', 'n', 'a', 's'], ['I', 'a', 'm', 'd', 'a', 't', 'a', 'S', 'c', 'i', 'e', 'n', 't', 'i', 's', 't'], ['I', 'a', 'm', 'S', 't', 'u', 'd', 'y', 'i', 'n', 'g', 'B', 'S', 'C', 'S']]
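
If this is needed in more than one place, the comprehension above could be wrapped in a small helper. This is only a sketch: the name split_characters and its column parameter are made up here, and it swaps the != ' ' test for str.split() followed by join, which removes every kind of whitespace rather than only spaces.

```python
def split_characters(rows, column=0):
    """Return one list of non-whitespace characters per row."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces back together removes all of it.
    return [list("".join(row[column].split())) for row in rows]


dataset = [
    ["My name is Anas", 1.92],
    ["I am data Scientist", 1.88],
    ["I am Studying BSCS", 2.0],
]

chars = split_characters(dataset)
print(chars[0])
```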

CodePudding user response：

I guess something like this, if you don't want the spaces:

for row in dataset:
    string = row[0]
    print([c for c in string if c != ' '])

CodePudding user response：

splited = []

for i in dataset:
    for char in i[0]:
        # Keep every character except spaces.
        if char != " ":
            splited.append(char)

print(splited)