I am using a CSV file of crypto news headlines to practice string manipulation and string methods. The CSV looks something like this:
publishdate headlinetext
20130504 COnSTELlATIon DaG iS nOW liStEd On kucoiN eXC?haNGE
20130511 ItA*lys cRypTOCUrREnCy BITgrAil suspeNds OpERatIOnS
20130511 THe diffeRENCe bETWEEn sHarEs aNd cRYpToCUrReN€CiES
20130512 fedS seIzE 47 mIlLION In bItCoinS in FAke ID ST=ing
20130514 ThE diG sTarteD ASiCboOST neTwORK AnD b@ItcoIN cAsH
20130516 BINAncE far atualiZAO progRaMadA NEsTa QuarTAFEIR?a
20130516 tHe EUropeaN UniOn IS pLaNninG tO rEgULAtE Bi=TcOIn
20130516 i!BeROBiT HELpiNG bItcOIn To GO MAinStream IN sPaIN
20130521 EuropES sMALLEr €banKS WELComE CrypTOcuRRENCy uSerS
20130604 BiTcoIn btc hIghER bTc= price BRiNgs mOrE Btc SCAmS
20130610 BITCOin brEAkS 9000 iN latEST LANDmARK Pr#iCE pOinT
20130613 ubcoiN mArkEt movEs ItS HEAdQUArTERS To €SiNgaPorE
20130624 reeds jEwelErs TaKinG bitCOiN ONLiNe And IN Sto$Res
20130705 CoNtrOvE!RSY turnS to cLoSuRe as LItePAy SHUts DowN
20130709 bUll rESiSTAncE BITCOIn pRicE nEeDs brEAK AbOve 9K*
20130714 DIVoRcE DISpUte co#Uple fIghtS ovEr 830k Of BITcoin
20130718 10K agaIn For BitcoiN buT oT!her CRyptOs OUTperfORM
20130724 FACebOoKS liBRa crYptoCUrrency wHER$E aRE ThE BANkS
20130726 COULd eNjIn coiN ReacH A neW AlltiM=E hIGh in APRIL
20130827 the GReaT Tug of WaR betWE=eN bItCOiNS anD AltCOINs
20130827 The SacRAMento kINgs mINE EthEreuM ETh for C#HArItY
20130905 cryPtOCuRREncY aTMs tHE KEY T*o WIdeSPRead ADoPtiOn
20130909 GraySCales EtHereUM TRusT pRICE= VaLUeS Eth aT 6000
(...)
Then I used pandas to read the CSV.
import pandas as pd
news_headlines = pd.read_csv('/content/sample_data/crypto_headlines.csv')
news_headlines
Now I need to get at the strings so I can convert them to lower or upper case and then remove the special characters.
However, I don't know which method to use to extract a single string from the news_headlines variable I created.
For example, say I want to extract the 2nd row, with the publish date 20130511.
Any help?
Thanks in advance
CodePudding user response:
You can use iloc, the position-based indexer in pandas. It takes square brackets rather than parentheses and counts from 0, so row 1, column 1 is the headline text of the second row:
news_headlines.iloc[1, 1]
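Once the value is extracted it is an ordinary Python string, so the usual string methods apply. The following is a minimal sketch of the full workflow, not a definitive solution. It assumes the file is comma-separated with the two column names shown in the question, and it rebuilds three of the sample rows in memory instead of reading the real file. The names sample_csv, headline, lowered, uppered, cleaned, on_date and the "clean" column are made up for illustration, and the regular expression keeps only lowercase letters, digits and spaces, which is one possible definition of "special character".

```python
import io
import re

import pandas as pd

# Hypothetical stand-in for crypto_headlines.csv, assumed comma-separated.
sample_csv = io.StringIO(
    "publishdate,headlinetext\n"
    "20130504,COnSTELlATIon DaG iS nOW liStEd On kucoiN eXC?haNGE\n"
    "20130511,ItA*lys cRypTOCUrREnCy BITgrAil suspeNds OpERatIOnS\n"
    "20130511,THe diffeRENCe bETWEEn sHarEs aNd cRYpToCUrReN€CiES\n"
)
news_headlines = pd.read_csv(sample_csv)

# Position-based lookup: second row (index 1), second column (index 1).
headline = news_headlines.iloc[1, 1]

# Plain str methods for changing case.
lowered = headline.lower()
uppered = headline.upper()

# Drop every character that is not a lowercase letter, a digit or a space.
cleaned = re.sub(r"[^a-z0-9 ]", "", lowered)

# Label-based alternative: select headlines by publish date instead of position.
# The date column is read as integers, so compare against an integer.
on_date = news_headlines.loc[
    news_headlines["publishdate"] == 20130511, "headlinetext"
]

# The same cleaning applied to the whole column with the .str accessor.
news_headlines["clean"] = (
    news_headlines["headlinetext"]
    .str.lower()
    .str.replace(r"[^a-z0-9 ]", "", regex=True)
)

print(cleaned)
print(news_headlines["clean"].tolist())
```

Note that two rows share the date 20130511, so selecting by date returns a Series with two headlines, while iloc[1, 1] returns exactly one string.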