I'm working on an assignment where we're given a text file containing the names of movies with their release dates and cast. I'm currently trying to separate the title of each movie from its cast. However, whenever I run my code I get the title of a movie with a seemingly random cast member attached, which isn't supposed to happen. Does anyone know what the bug is?
The text file is movies_mpaa.txt, and my code is below:
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include "vector.h" //you can also use #include <vector>
using namespace std;
//----------------------------------------------------//
//Let's get the text of the file into vectors
Vector<string> movieTitle(string txtfile)
{
Vector<string> Title; //Title of the Movie
fstream myFile;
string word;
int i = 0;
myFile.open(txtfile);
if(!myFile.good())
{
cout << "ERROR: FILE NOT FOUND" << endl;
exit(1);
}
while(getline(myFile, word, '\t'))
{
Title.push_back(word);
continue;
}
myFile.close();
return Title;
}
int main()
{
Vector<string> test;
test = movieTitle("movies_mpaa.txt");
cout << test[1] << endl;
return 0;
}
Whenever I run this, my output is:
Nela Wagman
Moon Knight (2022)
I'm trying to remove the "Nela Wagman" line. The movie title is separated from the cast by a tab, but for some reason the last cast member of the previous movie gets attached to the next movie's title, and I want to get rid of it.
CodePudding user response:
(Note: I am making the huge assumption that Vector.h behaves like std::vector.)
Well, your output makes zero sense: your program reads the entire file into that one vector called Title.
Vector<string> Title; //Title of the Movie
Why would you have a vector to store the title?
Anyway, it's not clear what you are trying to do. What do you expect the output of this program to be? As I said, it reads the entire file into that one vector. I am guessing that the output you show is the tail end of a huge piece of output.
The first thing you need to do is understand how to get one film out of that file. The layout of the file is as follows:
<title>\t<actor>\t<actor>\n
So I suggest that you call getline delimited by \n and then chop that line up delimited by tab (\t).
You can see here that you are reading the whole file, chopped up by tab:
while(getline(myFile, word, '\t'))
{
Title.push_back(word);
continue; // <<<<======== not needed BTW
}
Start easy by doing this:
Vector<string> films;
fstream myFile;
string word;
myFile.open(txtfile);
if(!myFile.good())
{
cout << "ERROR: FILE NOT FOUND" << endl;
exit(1);
}
while(getline(myFile, word, '\n'))
{
films.push_back(word);
}
Now you will have the entire db in that vector, one entry per movie.
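To get from "one line per movie" to just the titles, each line can then be split on tabs. The following is only a minimal sketch, assuming std::vector rather than the course's Vector class; split_record and read_titles are names made up for illustration, and the functions take a std::istream so they work with a file or a test string alike.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split one record (one line of the file) into its tab-separated fields.
// fields[0] is "Title (year)"; the remaining fields are cast members.
std::vector<std::string> split_record(const std::string& line)
{
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}

// Read every record from the stream and keep only the first field (the title).
std::vector<std::string> read_titles(std::istream& in)
{
    std::vector<std::string> titles;
    std::string line;
    while (std::getline(in, line)) {   // '\n' is the default delimiter: one movie per line
        const std::vector<std::string> fields = split_record(line);
        if (!fields.empty()) {         // skip blank lines
            titles.push_back(fields[0]);
        }
    }
    return titles;
}
```

With an open std::ifstream named myFile, calling read_titles(myFile) would give one "Title (year)" string per movie, with no cast names mixed in.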
CodePudding user response:
You already have your answer from @pm100, but let me provide a few thoughts on why you are having difficulties in the first place, as well as how to prevent some additional difficulties you will encounter in the future.
You essentially have a two-part problem:
- How do I parse (separate) the movie information from each line (record) in movies_mpaa.txt (there is information on 37215 films contained in the file); and
- How do I store that information so it is easily retrievable and reasonably efficiently stored?
The format for each record in the file is:
Title (year)\tCast Member1\tCast Member2\t....\n
So you have your title, then a space and your (year) in parentheses followed by a '\t', and then the names of the cast members separated by '\t' until the end of the record.
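As with any delimited record, you read the entire line into a std::string and then create a std::stringstream from the entire line, which you can then parse. Parsing directly from the file presents problems in the many cases where reading until a delimiter will ignore the final '\n' and begin reading from the next record. That is exactly what happens in your code: with '\t' as the delimiter, the '\n' is just another character, so the last cast member of one record and the title of the next end up in the same string. By creating a std::stringstream from the line, .eof() is set when the end of the line is reached, ensuring you only operate on fields from that single record.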
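The difference between the two approaches can be seen in a small sketch. It is an illustration only: tokens_by_tab and next_record_fields are hypothetical helper names, and the sample record used in the note below is invented, not taken from the real file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Naive approach: split the whole stream on '\t' only.
// '\n' is treated as ordinary text, so it stays inside a token.
std::vector<std::string> tokens_by_tab(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string token;
    while (std::getline(in, token, '\t')) {
        tokens.push_back(token);
    }
    return tokens;
}

// Per-record approach: read one line, then split only that line.
// The stringstream ends where the record ends, so nothing from the
// next record can leak into this one.
std::vector<std::string> next_record_fields(std::istream& in)
{
    std::vector<std::string> fields;
    std::string line;
    if (std::getline(in, line)) {
        std::stringstream ss(line);
        std::string field;
        while (std::getline(ss, field, '\t')) {
            fields.push_back(field);
        }
    }
    return fields;
}
```

For an input such as "Movie A (2001)\tNela Wagman\nMoon Knight (2022)\tOscar Isaac\n", the second token from tokens_by_tab is "Nela Wagman\nMoon Knight (2022)", which prints as two lines just like the output in the question. next_record_fields on the same input returns only "Movie A (2001)" and "Nela Wagman".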
Splitting the first field holding the title / year can be done with the .substr() member function by reading the title from the beginning until the .find_last_of(' ') (space) character. The year can be separated by starting at .find_last_of('(') + 1 and reading the next 4 characters.
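As a rough sketch of that split in isolation (split_title_year is a hypothetical helper name, and it assumes the field really does end in " (year)" with a 4-character year, as in the format above):

```cpp
#include <string>
#include <utility>

// Split a "Title (year)" field into {title, year}.
// Assumption: the field ends in a space followed by a 4-character year in
// parentheses. If there is no '(', the year is returned empty.
std::pair<std::string, std::string> split_title_year(const std::string& field)
{
    const std::string::size_type space = field.find_last_of(' ');
    const std::string::size_type paren = field.find_last_of('(');

    // substr(0, npos) yields the whole string, so a field without a space
    // is kept intact as the title.
    std::string title = field.substr(0, space);

    std::string year;
    if (paren != std::string::npos) {
        year = field.substr(paren + 1, 4);   // the 4 characters after '('
    }
    return {title, year};
}
```

For example, split_title_year("Three Days of Rain (2002)") gives the title "Three Days of Rain" and the year "2002". Checking for npos before adding 1 avoids silently taking the first four characters of the title when a record has no "(year)" part.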
For the cast members, you simply loop continually, isolating each cast-member name in turn and using .push_back to add that name to your vector of strings.
To keep the data manageable, using a struct for each movie holding the separated title, year and then vector of cast makes sense. That single object can concisely contain the information on one movie.
Then a std::vector<movie> provides a simple way to create a vector containing all of the movies you have read from the data file. You can handle that any way you like, but a struct nested in a class makes things fairly straightforward, and you can write a simple overload of >> to handle the input separation and storage, and another overload of << to output the details of each movie.
A Short Example
A short example putting those principles to work could have your movie struct as a private member of a films class, where the combined data for all movies is held in a std::vector<movie> as a private member of the surrounding class. You pass the filename to read as the first argument to the program, and if you #define PRNMOVIES then the details of each stored movie are printed to stdout (don't do this for the full file; use it as a test on a couple of records).
In either case the total number of movies stored in your films object is shown. It can be written in a number of different ways; here is but one:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
class films {
private:
struct movie { /* struct to hold title, year, cast */
std::string title;
std::string year;
std::vector<std::string> cast;
/* overload of >> to read one movie from input into struct */
friend std::istream& operator >> ( std::istream& is, movie& m ) {
std::string record {}; /* string to hold line */
if (getline (is, record)) { /* read line */
std::stringstream ss (record); /* create stringstream from line */
std::string titleyear{}; /* string 1st field title/year */
std::string member {}; /* string for one cast member */
/* separate title and year */
if (!getline (ss, titleyear, '\t')) {
return is;
}
/* separate title on last space, year is 4-char after last '(' */
m.title = titleyear.substr (0, titleyear.find_last_of(' '));
m.year = titleyear.substr (titleyear.find_last_of('(') + 1, 4);
/* loop reading cast member and add to cast vector */
while (getline (ss, member, '\t')) {
m.cast.push_back(member);
}
}
return is;
}
/* overload of << to output movie info */
friend std::ostream& operator << ( std::ostream& os, const movie& m ) {
os << "title : " << m.title <<
"\nyear : " << m.year <<
"\ncast :\n";
for (const auto& c : m.cast) {
os << " " << c << '\n';
}
os.put ('\n');
return os;
}
/* get count of cast-members stored */
size_t get_cast_count (movie& m) { return m.cast.size(); }
};
std::vector<movie> movies; /* vector of movie for all films */
public:
films () { movies.clear(); } /* default */
films (std::istream& is) { /* construct passing istream */
while (1) { /* loop continually */
movie m{}; /* temporary movie */
if (!(is >> m)) { /* read record into m, break on fail */
break;
}
movies.push_back(m); /* add movie to movies */
}
}
/* getter - number of movies stored */
size_t get_film_count() { return movies.size(); }
void prn_films() { /* test print (DON'T run on whole file) */
for (const auto& f : movies) {
std::cout << f;
}
}
};
int main (int argc, char **argv) {
if (argc < 2) { /* validate one argument given for filename */
std::cerr << "error: insufficient arguments.\n"
"usage: " << argv[0] << " filename\n";
return 1;
}
std::ifstream f (argv[1]); /* open file */
if (! f.good()) { /* validate file open for reading */
std::cerr << "error: file open failed '" << argv[1] << "'.\n";
return 1;
}
films mpaa (f); /* construct films reading all records from stream */
/* output number of movies stored */
std::cout << "read " << mpaa.get_film_count() << " films.\n";
#ifdef PRNMOVIES
mpaa.prn_films(); /* conditionally output details if PRNMOVIES defined */
#endif
}
Example Use/Output
Running the timed program without PRNMOVIES defined on your entire movies_mpaa.txt file shows all 37215 films can be read and separated in a little over 0.2 seconds, e.g.
$ time ./bin/moviempaa ~/tmp/movies_mpaa.txt
read 37215 films.
real 0m0.206s
user 0m0.180s
sys 0m0.026s
Checking the detailed output on 2 records from the file, using a short subfile created with head -n2 movies_mpaa.txt > movies_mpaa2.txt, allows you to define PRNMOVIES and keep the output to a hundred lines or so, e.g.
$ ./bin/moviempaa2 dat/movies_mpaa2.txt
read 2 films.
title : C.O.G.
year : 2013
cast :
Danny Belrose
Alexander Chapin-Plata
Sean Ghazi
Jonathan Groff
Tommy Hestmark
Louis Hobson
Kamyar Jahan
Simos Kalivas
Timothy Levine
Castillo Morales
Eloy M?ndez
Denis O'Hare
Bob Olin
Tim Patteron
Vu Pham
Diego Sanchez
Zach Sanchez
Brennan Sprecher
Dean Stockwell
Corey Stoll
Tyron Strickland
Jeremy Evan Taylor
Lance Weldon
Gloria Alvarez
Lara Baker
Katy Beckemeyer
Troian Bellisario
Kim Bissett
Ellen Bloodworth
Dale Dickey
Beth Furumasu
Keiko Green
Julie Groff
Teresa Wells Jones
Karli Klein
Katie Klein
Blake Lindsley
Marvella McPartland
Dana Millican
Jennifer Oswald
Tyra Richards
Jewel Robinson
Asha Sawyer
Cami Sturm
Casey Wilson
title : Three Days of Rain
year : 2002
cast :
Erick Avari
Alimi Ballard
Joey Bilow
Bruce Bohne
Robert Carradine
Robert Casserly
Chuck Cooper
Keir Dullea
Peter Falk
Mark Feuerstein
Peter Kalos
George Kuchar
Lyle Lovett
John Carroll Lynch
Don Meredith
Jason Patric
Max Perlich
Wayne Rogers
Michael Santoro
Peter Henry Schroeder
Bill Stockton
Penelope Allen
Laurie Coleman
Blythe Danner
Jordan Elliott
Heather Kafka
Christine Karl
Merle Kennedy
Claire Kirk
Maggie Walker
Look things over and understand (1) the basic approach used to parse the information from each line and (2) how the nested struct movie allows the films class to create the std::vector<movie> movies; to hold all information for all movies. How you put the pieces together is up to you; this just shows one basic approach. Let me know if you have questions.