I have a Python script that counts the words in the given files and saves the output to a "result.txt" file. I want my Docker container to run this script when the container starts and display the output on the console. Below are my Dockerfile and my Python file.
FROM python:3
RUN mkdir /home/data
RUN mkdir /home/output
RUN touch /home/output/result.txt
WORKDIR /home/code
COPY word_counter.py ./
CMD ["python", "word_counter.py"]
ENTRYPOINT cat ../output/result.txt
import glob
import os
from collections import OrderedDict
import socket
from pathlib import Path
dir_path = os.path.dirname(os.path.realpath(__file__))
# print(type(dir_path))
parent_path = Path(dir_path).parent
data_path = str(parent_path) + "/data"
# print(data_path)
os.chdir(data_path)
myFiles = glob.glob('*.txt')
output = open("../output/result.txt", "w")
output.write("files in home/data are : ")
output.write('\n')
for x in myFiles :
    output.write(x)
    output.write('\n')
output.close()
total_words = 0
for x in myFiles :
    file = open(x, "r")
    data = file.read()
    words = data.split()
    total_words = total_words + len(words)
    file.close()
output = open("../output/result.txt", "a")
output.write("Total number of words in both the files : " + str(total_words))
output.write('\n')
output.close()
frequency = {}
for x in myFiles :
    if x == "IF.txt" :
        curr_file = x
        document_text = open(curr_file, 'r')
        text_string = document_text.read()
        words = text_string.split()
        for word in words:
            count = frequency.get(word,0)
            frequency[word] = count + 1
frequency_list_desc_order = sorted(frequency, key=frequency.get, reverse=True)
output = open("../output/result.txt", "a")
output.write("Top 3 words in IF.txt are :")
output.write('\n')
ip_addr = socket.gethostbyname(socket.gethostname())
for word in frequency_list_desc_order[:3]:
    line = word + " : " + str(frequency[word])
    output.write(line)
    output.write('\n')
output.write("ip address of the machine : " + ip_addr + "\n")
output.close()
I am mapping a local directory on the host machine, which contains two text files (IF.txt and Limerick1.txt), to the directory "/home/data" inside the container. The Python code inside the container reads the files and saves the output to result.txt in "/home/output" inside the container.
I want my container to print the contents of "result.txt" to the console when I start the container using the docker run command.
Issue: Docker does not execute the following statement when I start a container using docker run.
CMD ["python", "word_counter.py"]
Command to run the container:
docker run -it -v /Users/xyz/Desktop/project/docker:/home/data proj2docker bash
But when I run the same command, "python word_counter.py", from within the container, it executes perfectly fine.
Can someone help me with this?
Answer:
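You have an ENTRYPOINT in your Dockerfile. When an ENTRYPOINT is set, the CMD is not run on its own; it is only passed to the entrypoint as additional argument(s). Because your ENTRYPOINT is written in shell form, Docker also wraps it in /bin/sh -c.
The final command that runs when the container starts is therefore effectively
/bin/sh -c "cat ../output/result.txt" python word_counter.py
The shell only runs the cat; the words from the CMD end up as unused positional arguments, so python never runs. This is likely not what you want. I suggest removing that entrypoint or fixing it according to your needs.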
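To make the interaction easier to reason about, here is a simplified Python model of how the start command is assembled from ENTRYPOINT, CMD and the arguments given to docker run. It is only a sketch of the documented behaviour, not Docker's actual code; the function name effective_argv and its parameters are made up for this illustration.

```python
def effective_argv(entrypoint, cmd, run_args=None):
    """Simplified model of the command a container starts with.

    entrypoint: list (exec form), str (shell form) or None.
    cmd:        the image's CMD as a list, or None.
    run_args:   words given after the image name in `docker run`;
                when present they replace the CMD.
    """
    args = list(run_args) if run_args else list(cmd or [])
    if entrypoint is None:
        return args
    if isinstance(entrypoint, str):
        # Shell form is wrapped in /bin/sh -c. The extra words are only
        # visible to the shell as $0, $1, ... and the script ignores them.
        return ["/bin/sh", "-c", entrypoint] + args
    return list(entrypoint) + args


# The Dockerfile from the question, started with plain `docker run`:
print(effective_argv("cat ../output/result.txt", ["python", "word_counter.py"]))
# The same image started with `docker run ... proj2docker bash`:
print(effective_argv("cat ../output/result.txt", ["python", "word_counter.py"], ["bash"]))
```

In both cases the only thing that actually executes is the cat, which matches what you are seeing.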
If you want to print that file and still execute that command, you can do something like the below.
CMD ["python", "word_counter.py"]
ENTRYPOINT ["/bin/sh", "-c", "cat ../output/result.txt; exec \"$@\"", "sh"]
It will run some command(s) as the entrypoint, in this case printing the contents of that file, and after that execute the CMD, which is available as "$@". The trailing "sh" is only a placeholder: with sh -c, the first word after the script string becomes $0, so without it "python" would be swallowed and "$@" would contain just word_counter.py.
This is standard POSIX shell behaviour; in any shell script, "$@" gives you all the arguments that were passed to the script. The benefit of using exec here is that python replaces the shell and runs with process ID 1, which is useful when you want to send signals to the python process in the container, for example with kill.
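One detail of sh -c is easy to check locally: the first word after the script string becomes $0 and is not part of "$@". The following sketch shows this with Python's subprocess module, assuming a POSIX /bin/sh is available; the helper name sh_c is made up for this example.

```python
import subprocess


def sh_c(script, *args):
    """Run `/bin/sh -c script args...` and return its stdout, stripped."""
    result = subprocess.run(
        ["/bin/sh", "-c", script, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# No placeholder: "python" is consumed as $0, so "$@" loses it.
print(sh_c('echo "$@"', "python", "word_counter.py"))        # word_counter.py
# With "sh" as a placeholder for $0, "$@" holds the whole command.
print(sh_c('echo "$@"', "sh", "python", "word_counter.py"))  # python word_counter.py
```

This is why an entrypoint built on sh -c needs one extra placeholder word before the CMD is appended.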
Lastly, when you start the container with the command you show
docker run -it -v /Users/xyz/Desktop/project/docker:/home/data proj2docker bash
Here you are overriding the CMD from the Dockerfile with bash, so it is expected that python doesn't run, even if your entrypoint didn't have the issue mentioned above. If you always want to run the python program, you need to make it part of the entrypoint. The catch is that the entrypoint would then run the program until it finishes and only afterwards start your command, in this case bash.
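You could run it in the background, if that's what you want. Note that there is no default CMD in this variant, but the exec "$@" is still there, which allows you to run an arbitrary command such as bash while python is running in the background. The & already terminates the python command, so no ; is needed after it, and the trailing "sh" is again just a placeholder for $0.
ENTRYPOINT ["/bin/sh", "-c", "cat ../output/result.txt; python word_counter.py & exec \"$@\"", "sh"]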
If you do a lot of work in the entrypoint, it is probably cleaner to move it into a dedicated script and use that script as the entrypoint. You can still call exec "$@" at the end of your shell script.
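Such a dedicated script does not have to be a shell script. Since the image is based on python:3, a minimal sketch in Python could look like the following. The file name entrypoint.py and the helper names are hypothetical, and os.execvp plays the role of the shell's exec.

```python
#!/usr/bin/env python3
"""Hypothetical entrypoint.py: print the result file, then hand over to the CMD."""
import os
import sys

RESULT_FILE = "/home/output/result.txt"  # path taken from the Dockerfile in the question


def read_result(path=RESULT_FILE):
    """Return the file's contents, or an empty string if it does not exist."""
    try:
        with open(path) as handle:
            return handle.read()
    except FileNotFoundError:
        return ""


def main(argv):
    # Same job as `cat ../output/result.txt`.
    sys.stdout.write(read_result())
    # Flush first: buffered output is not written automatically on exec.
    sys.stdout.flush()
    if argv:
        # Same job as `exec "$@"`: replace this process with the command,
        # so the command keeps PID 1 and receives signals directly.
        os.execvp(argv[0], argv)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Assuming the file is copied into the image next to word_counter.py, it could be wired up with ENTRYPOINT ["python", "entrypoint.py"], and the CMD or the arguments given to docker run would then arrive in sys.argv[1:].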
According to your comment, you want to run python first and then cat on the file. You could drop the entrypoint and do it just with the command.
CMD ["/bin/sh", "-c", "python word_counter.py && cat ../output/result.txt"]
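The && in that command means the file is only printed if the python program exits successfully. For illustration, the same logic written as a small Python sketch; the function name run_then_show is made up, and this is not something the container needs, just a way to see the control flow.

```python
import subprocess
import sys


def run_then_show(command, result_path):
    """Run command; print result_path only if the command exited with 0."""
    completed = subprocess.run(command)
    if completed.returncode != 0:
        # Like `&&`: skip the second step when the first one fails.
        return completed.returncode
    with open(result_path) as handle:
        sys.stdout.write(handle.read())
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python run_then_show.py ../output/result.txt python word_counter.py
    sys.exit(run_then_show(sys.argv[2:], sys.argv[1]))
```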