Python make true = 1 and false = 0

Time:09-29

I'm looking for a way to create 1/0 indicator columns that flag the strings containing a specific piece of text.

I'm familiar with R and just getting started with Python, so I would appreciate your guidance on the following:

import pandas as pd

codes = ["G06Q0030020000 | G06Q0010040000 | G06Q0030018000 | G06Q0030060000 | G06Q0030060700 | G06Q0030060900", "C12Y0301010040 | A23L0015250000 | A23L0027600000", "A61B0018040000", "C07C0213080000 | C07C0051373000 | A61P0005000000", "B82Y0005000000 | A61K0031418800 | A61K0051109300 | A61K0047689800 | A61K0039395000 | A61K0047500000 | A61P0035000000", "A61K0008898000 | A61Q0003000000 | A61Q0005020000 | A61Q0005120000 | A61Q0019000000 | C07F0007087900 | C07F0007088900 | C08G0077382000 | C08G0077440000 | C08G0077480000 | C08G0077540000 | C07F0007083800", "G06Q0010080000", "A61K0035740000 | A61K0009505700 | A23L0029284000 | A23L0033135000 | A23P0010300000", "A61K0035740000 | A61K0009505700 | A23L0029284000 | A23L0033135000 | A23P0010300000", "G06Q0010083300 | G06Q0030027800"]
df = pd.DataFrame(codes)

#FIRST TRY - 0's ONLY
for_food = ["A21","A23","A22","C12Q","C12G"] 
for i in for_food:
    if i in df["codes"]:
        df["food"] = 1
    else:
        df["food"] = 0
    
if "A61K0008" in df["codes"]:
    df["cosmetics"] = 1
else:
    df["cosmetics"] = 0

if "A61K0035" in df["codes"]:
    df["medical"] = 1
else:
    df["medical"] = 0
    
if "G06Q" in df["codes"]:
    df["banking"] = 1
else:
    df["banking"] = 0

# SECOND TRY - GOOD FOR 1 PIECE OF TEXT (STILL NEED TO MAKE True = 1 AND False = 0)
df["medical"] = df["codes"].str.contains("A61K0035")
df["cosmetics"] = df["codes"].str.contains("A61K0008")
df["banking"] = df["codes"].str.contains("G06Q")
# BUT THE MULTIPLE DIDN'T WORK
df["food"] = df["codes"].str.contains(for_food)

# THIRD TRY (only for_food)
df["food"] = 1 for i in for_food if i in df["All CP Classifications"] else df["food"] = 0 # invalid syntax

# FOURTH TRY
df["food"] = [1 for i in for_food if df["All CP Classifications"].str.contains(i)] # The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

None of these attempts produces the right 'food' column. Could someone please guide me?

CodePudding user response:

Use np.select with a single regex pattern built by joining the prefixes with "|":

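import numpy as np
import pandas as pd

df = pd.DataFrame(codes, columns=['codes'])  # name the column so df['codes'] exists
for_food = ["A21", "A23", "A22", "C12Q", "C12G"]
# '|'.join(for_food) gives the regex "A21|A23|A22|C12Q|C12G", which matches any of the prefixes
condition = [df['codes'].str.contains('|'.join(for_food))]
choice = [1]
# rows where the condition holds get 1, all other rows get the default 0
df['food'] = np.select(condition, choice, default=0)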

You can use the same pattern for the other conditions. Also, if you want to see 1 and 0 instead of True and False, you can simply cast the boolean column to int:

#example
df["medical"] = df["medical"].astype(int)
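
Putting the two ideas together, here is a minimal sketch (not part of the original answer) that builds all four columns in one loop. The `categories` dictionary is a hypothetical helper introduced for illustration, and the sample data is a shortened version of the `codes` list from the question. It assumes a plain substring match is what is wanted; `re.escape` is added only so that a prefix containing regex characters would still be matched literally.

```python
import re
import pandas as pd

# Shortened sample data in the same format as the question's `codes` list.
codes = [
    "G06Q0030020000 | G06Q0010040000",
    "C12Y0301010040 | A23L0015250000 | A23L0027600000",
    "A61B0018040000",
    "A61K0008898000 | A61Q0003000000",
    "A61K0035740000 | A23L0029284000",
]
df = pd.DataFrame(codes, columns=["codes"])

# Hypothetical mapping: new column name -> prefixes that should flag a row.
categories = {
    "food": ["A21", "A23", "A22", "C12Q", "C12G"],
    "cosmetics": ["A61K0008"],
    "medical": ["A61K0035"],
    "banking": ["G06Q"],
}

for column, prefixes in categories.items():
    # Join the prefixes into one regex alternation, e.g. "A21|A23|A22|C12Q|C12G".
    pattern = "|".join(re.escape(p) for p in prefixes)
    # str.contains returns a boolean Series; astype(int) maps True/False to 1/0.
    df[column] = df["codes"].str.contains(pattern).astype(int)

print(df[["food", "cosmetics", "medical", "banking"]])
```

Two things are worth keeping in mind. First, `"A61K0008" in df["codes"]` from the first try tests membership in the Series index (the row labels), not in the values, which is why that attempt only ever produced 0. Second, `str.contains` matches anywhere in the string, so a short pattern such as "A21" would also match if it appeared in the middle of a longer code; anchoring the pattern would be needed if only the start of each code should count.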