[2021.07.23] Intern +144 How to upload multiple files from Linux (Ubuntu 20.04) to AWS S3 using Python (boto3)? (+ syncing EC2 -> AWS S3 every minute with cron)
This post walks through how to upload multiple files from an AWS EC2 Linux instance to AWS S3.
ubuntu@ip-172-31-9-174:~$ python3 -m pip install boto3
Collecting boto3
Downloading boto3-1.18.5.tar.gz (102 kB)
|████████████████████████████████| 102 kB 1.7 MB/s
Collecting botocore<1.22.0,>=1.21.5
Downloading botocore-1.21.5-py3-none-any.whl (7.7 MB)
|████████████████████████████████| 7.7 MB 2.6 MB/s
Collecting jmespath<1.0.0,>=0.7.1
Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.6.0,>=0.5.0
Downloading s3transfer-0.5.0-py3-none-any.whl (79 kB)
|████████████████████████████████| 79 kB 12.8 MB/s
Collecting python-dateutil<3.0.0,>=2.1
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
|████████████████████████████████| 247 kB 64.3 MB/s
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.22.0,>=1.21.5->boto3) (1.25.8)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.22.0,>=1.21.5-
Building wheels for collected packages: boto3
Building wheel for boto3 (setup.py) ... done
Created wheel for boto3: filename=boto3-1.18.5-py3-none-any.whl size=128958 sha256=326c97492d68a83486f6c6b8c034ed66e0358fcd6c71b7502
Stored in directory: /home/ubuntu/.cache/pip/wheels/2f/31/70/964258e8593ba28258a61b66403a19e702163f3dc027c811cd
Successfully built boto3
Running the script right after installing boto3 fails with the error below, because no AWS credentials have been configured yet.
python3 local_pc_to_aws_S3_upload.py
Error : Unable to locate credentials
ubuntu@ip-172-31-9-174:~$ sudo apt install awscli
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
docutils-common fontconfig fonts-droid-fallback fonts-noto-mono fonts-urw-base35 ghostscript groff gsfonts hicolor-icon-theme imagemagick imagemagick-6-common
imagemagick-6.q16 libcairo2 libdatrie1 libdjvulibre-text libdjvulibre21 libfftw3-double3 libgs9 libgs9-common libidn11 libijs-0.35 libilmbase24 libimagequant0
libjbig0 libjbig2dec0 liblqr-1-0 libmagickcore-6.q16-6 libmagickcore-6.q16-6-extra libmagickwand-6.q16-6 libnetpbm10 libopenexr24 libopenjp2-7 libpango-1.0-0
libpangocairo-1.0-0 libpangoft2-1.0-0 libpaper-utils libpaper1 libpixman-1-0 libthai-data libthai0 libtiff5 libwebp6 libwebpdemux2 libwebpmux3 libwmf0.2-7
libxcb-render0 libxcb-shm0 netpbm poppler-data psutils python3-botocore python3-dateutil python3-docutils python3-jmespath python3-olefile python3-pil
python3-pygments python3-roman python3-rsa python3-s3transfer sgml-base xml-core
Suggested packages:
fonts-noto fonts-freefont-otf | fonts-freefont-ttf fonts-texgyre ghostscript-x imagemagick-doc autotrace cups-bsd | lpr | lprng enscript ffmpeg gimp gnuplot
grads graphviz hp2xx html2ps libwmf-bin mplayer povray radiance sane-utils texlive-base-bin transfig ufraw-batch xdg-utils libfftw3-bin libfftw3-dev inkscape
libjxr-tools libwmf0.2-7-gtk poppler-utils fonts-japanese-mincho | fonts-ipafont-mincho fonts-japanese-gothic | fonts-ipafont-gothic fonts-arphic-ukai
fonts-arphic-uming fonts-nanum docutils-doc fonts-linuxlibertine | ttf-linux-libertine texlive-lang-french texlive-latex-base texlive-latex-recommended
python-pil-doc python3-pil-dbg python-pygments-doc ttf-bitstream-vera sgml-base-doc debhelper
The following NEW packages will be installed:
awscli docutils-common fontconfig fonts-droid-fallback fonts-noto-mono fonts-urw-base35 ghostscript groff gsfonts hicolor-icon-theme imagemagick
imagemagick-6-common imagemagick-6.q16 libcairo2 libdatrie1 libdjvulibre-text libdjvulibre21 libfftw3-double3 libgs9 libgs9-common libidn11 libijs-0.35
libilmbase24 libimagequant0 libjbig0 libjbig2dec0 liblqr-1-0 libmagickcore-6.q16-6 libmagickcore-6.q16-6-extra libmagickwand-6.q16-6 libnetpbm10 libopenexr24
libopenjp2-7 libpango-1.0-0 libpangocairo-1.0-0 libpangoft2-1.0-0 libpaper-utils libpaper1 libpixman-1-0 libthai-data libthai0 libtiff5 libwebp6 libwebpdemux2
libwebpmux3 libwmf0.2-7 libxcb-render0 libxcb-shm0 netpbm poppler-data psutils python3-botocore python3-dateutil python3-docutils python3-jmespath
python3-olefile python3-pil python3-pygments python3-roman python3-rsa python3-s3transfer sgml-base xml-core
0 upgraded, 63 newly installed, 0 to remove and 14 not upgraded.
Need to get 33.5 MB of archives.
After this operation, 161 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
ubuntu@ip-172-31-9-174:~$ pip3 install awscli --upgrade --user
Collecting awscli
Downloading awscli-1.20.5-py3-none-any.whl (3.7 MB)
|████████████████████████████████| 3.7 MB 1.7 MB/s
Requirement already satisfied, skipping upgrade: s3transfer<0.6.0,>=0.5.0 in ./.local/lib/python3.8/site-packages (from awscli) (0.5.0)
Requirement already satisfied, skipping upgrade: PyYAML<5.5,>=3.10 in /usr/lib/python3/dist-packages (from awscli) (5.3.1)
Collecting docutils<0.16,>=0.10
Downloading docutils-0.15.2-py3-none-any.whl (547 kB)
|████████████████████████████████| 547 kB 43.1 MB/s
Requirement already satisfied, skipping upgrade: colorama<0.4.4,>=0.2.5 in /usr/lib/python3/dist-packages (from awscli) (0.4.3)
Requirement already satisfied, skipping upgrade: rsa<4.8,>=3.1.2 in /usr/lib/python3/dist-packages (from awscli) (4.0)
Requirement already satisfied, skipping upgrade: botocore==1.21.5 in ./.local/lib/python3.8/site-packages (from awscli) (1.21.5)
Requirement already satisfied, skipping upgrade: urllib3<1.27,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore==1.21.5->awscli) (1.25.8)
Requirement already satisfied, skipping upgrade: jmespath<1.0.0,>=0.7.1 in ./.local/lib/python3.8/site-packages (from botocore==1.21.5->awscli) (0.10.0)
Requirement already satisfied, skipping upgrade: python-dateutil<3.0.0,>=2.1 in ./.local/lib/python3.8/site-packages (from botocore==1.21.5->awscli) (2.8.2)
Requirement already satisfied, skipping upgrade: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore==1.21.5->awscli) (1.14.0)
Installing collected packages: docutils, awscli
Successfully installed awscli-1.20.5 docutils-0.15.2
ubuntu@crawling:~/linux_s3_upload_dir$ aws --version
aws-cli/1.20.5 Python/3.8.10 Linux/5.8.0-1036-azure botocore/1.21.5
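★ Run aws configure, not sudo aws configure. With sudo the credentials are saved for root, so a script run as the ubuntu user still cannot find them. ★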
ubuntu@ip-172-31-9-174:~$ aws configure
AWS Access Key ID [None]: <YOUR_ACCESS_KEY_ID>
AWS Secret Access Key [None]: <YOUR_SECRET_ACCESS_KEY>
Default region name [None]: ap-northeast-2
Default output format [None]: json
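For reference, aws configure stores these answers as INI files under ~/.aws/ of the user who ran it (credentials for the keys, config for the region and output format). A minimal sketch for checking that a profile is present, using only the standard library; the helper name has_profile is hypothetical:

```python
import configparser


def has_profile(credentials_path, profile="default"):
    """Return True if the INI file has both key fields for the given profile."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file is silently ignored
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section


# Usage: pass the path of the credentials file of the user who runs the script,
# e.g. has_profile("/home/ubuntu/.aws/credentials")
```

If this returns False for the ubuntu user, the "Unable to locate credentials" error is the expected outcome.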
--------------------------------------------------------------------------------------------------------------------------
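Now that configuration is done, let's look at the Python code, check whether the S3 bucket already contains any files, and then run it.
<Checking the Linux directory and the S3 bucket before running the Python code>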
ubuntu@crawling:~/linux_s3_upload_dir$ ls
0723_Linux_to_aws_S3_upload_succes.py after1day.log lawtalk_1day.log
-> At this point the AWS S3 bucket is empty.
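The same emptiness check can be done from Python. A minimal sketch, assuming a boto3 S3 client is passed in (for example boto3.client("s3")); the helper name list_keys is hypothetical:

```python
def list_keys(s3_client, bucket):
    """Return every object key in the bucket (an empty list means the bucket is empty)."""
    keys = []
    # list_objects_v2 returns at most 1000 keys per call, so use the paginator.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent from a page when there are no objects.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Usage (needs boto3 and configured credentials):
#   import boto3
#   print(list_keys(boto3.client("s3"), "elk-data-storage"))
```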
Linux_to_AWS S3 upload.py
#!/usr/bin/env python3
# coding: utf-8
import os

import boto3


def file_upload_s3():
    s3_bucket = "elk-data-storage"
    s3 = boto3.resource('s3')
    dir_path = []   # full local paths of the files to upload
    file_name = []  # file names, used as the S3 object keys
    directory_path = "/home/ubuntu/linux_s3_upload_dir"  # ★★★★★ directory whose files are uploaded ★★★★★

    for file in os.listdir(directory_path):
        path = os.path.join(directory_path, file)  # e.g. /home/ubuntu/linux_s3_upload_dir/{file name}
        if not os.path.isfile(path):
            continue  # upload_file only handles regular files, so skip subdirectories
        dir_path.append(path)
        file_name.append(file)
        print(path)
    print(dir_path)
    print(file_name)

    # Upload exactly the files found above. Looping over a fixed range(999) and
    # relying on except to stop would always end with "list index out of range".
    for path, key in zip(dir_path, file_name):
        try:
            s3.meta.client.upload_file(path, s3_bucket, key)
        except Exception as e:
            print(e)  # report the failure and continue with the next file
    # The message reads: "Local PC -> AWS S3 file upload complete."
    print("Local PC -> AWS S3 파일 업로드를 완료하였습니다.")


file_upload_s3()
<Result>
ubuntu@crawling:~/linux_s3_upload_dir$ python3 0723_Linux_to_aws_S3_upload_succes.py
['0723_Linux_to_aws_S3_upload_succes.py', 'after1day.log', 'lawtalk_1day.log']
list index out of range
Local PC -> AWS S3 파일 업로드를 완료하였습니다.
ubuntu@crawling:~/linux_s3_upload_dir$
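<Success>
The "list index out of range" line in this run came from looping over a fixed range(999); it was printed only after all three files had been uploaded. The final line means "Local PC -> AWS S3 file upload complete."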
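The script above uploads only the files directly inside one directory. If subdirectories also need to be uploaded, one possible approach (not part of the original script) is to walk the tree and use each file's relative path as the S3 key. A minimal sketch using only the standard library; iter_upload_pairs is a hypothetical helper name:

```python
import os


def iter_upload_pairs(directory):
    """Yield (local_path, s3_key) for every regular file under directory."""
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # S3 keys use "/" as the separator regardless of the local OS.
            rel = os.path.relpath(local_path, directory)
            yield local_path, rel.replace(os.sep, "/")
```

Each pair can then be passed to upload_file as (local_path, bucket, s3_key).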
--------------------------------------------------------------------------------------------------------------------
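Syncing EC2 -> AWS S3 every minute with cron (the * * * * * schedule below runs once per minute; cron cannot schedule per second)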
ubuntu@ip-172-31-9-174:~$ cat s3_upload.sh
python3 /home/ubuntu/local_pc_to_aws_S3_upload.py
ubuntu@ip-172-31-9-174:~$ crontab -l
* * * * * /home/ubuntu/s3_upload.sh > s3_upload.log
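After editing the crontab with crontab -e, cron normally picks up the change on its own. Restarting the service as below is optional, but checking its status afterwards is a quick way to confirm that the job is running.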
ubuntu@ip-172-31-9-174:~$ sudo service cron restart
ubuntu@ip-172-31-9-174:~$ systemctl status cron
● cron.service - Regular background program processing daemon
Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2021-07-26 12:49:00 KST; 6min ago
Docs: man:cron(8)
Main PID: 50764 (cron)
Tasks: 1 (limit: 9536)
Memory: 1.6M
CGroup: /system.slice/cron.service
└─50764 /usr/sbin/cron -f
Jul 26 12:52:02 ip-172-31-9-174 CRON[50832]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:53:01 ip-172-31-9-174 CRON[50859]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:53:01 ip-172-31-9-174 CRON[50860]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:53:03 ip-172-31-9-174 CRON[50859]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:54:01 ip-172-31-9-174 CRON[50889]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:54:01 ip-172-31-9-174 CRON[50890]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:54:02 ip-172-31-9-174 CRON[50889]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:55:01 ip-172-31-9-174 CRON[50944]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:55:01 ip-172-31-9-174 CRON[50945]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:55:03 ip-172-31-9-174 CRON[50944]: pam_unix(cron:session): session closed for user ubuntu
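The log above shows the job firing once per minute, which is the finest granularity cron offers: the five fields of * * * * * are minute, hour, day of month, month, and day of week, and there is no seconds field. A small sketch of when such an entry fires next; next_run is a hypothetical helper, not part of cron:

```python
from datetime import datetime, timedelta


def next_run(now):
    """Return the next time a '* * * * *' crontab entry fires after now."""
    # Such an entry fires at second 0 of every minute.
    return now.replace(second=0, microsecond=0) + timedelta(minutes=1)


print(next_run(datetime(2021, 7, 26, 12, 53, 3)))  # 2021-07-26 12:54:00
```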
--------------------------------------------------------------------------------------------------------------------
AWS S3 commands you can use on Linux
# List AWS S3 buckets
ubuntu@crawling:~/linux_s3_upload_dir$ aws s3 ls
2021-03-18 18:26:08 allthatfabric
2021-03-25 15:42:42 allthatfabric-file
2021-07-21 16:58:23 elk-data-storage
2021-04-24 12:39:52 mychrnoicdisease
2021-06-11 12:07:13 s3-06111207
# Delete a file in an AWS S3 bucket
ubuntu@ip-172-31-9-174:~/data$ aws s3 rm s3://elk-data-storage/test.11
delete: s3://elk-data-storage/test.11
# The file must also be deleted on Linux; otherwise the cron job above uploads it again within a minute
ubuntu@ip-172-31-9-174:~/data$ rm -rf test.11
ubuntu@ip-172-31-9-174:~/data$ ls
0723_waist.csv air_quality.csv 국가건강검진_혈압혈당데이터.csv 국가건강검진_혈액검사데이터.csv
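The two deletions above can also be done in one step from Python. A minimal sketch, assuming a boto3 S3 client is passed in and that the object key equals the local file name, as in the upload script; delete_everywhere is a hypothetical helper name:

```python
import os


def delete_everywhere(s3_client, bucket, local_dir, file_name):
    """Delete file_name from local_dir, then the same-named object from the bucket."""
    local_path = os.path.join(local_dir, file_name)
    # Remove the local copy first so the next cron run cannot re-upload it.
    if os.path.exists(local_path):
        os.remove(local_path)
    s3_client.delete_object(Bucket=bucket, Key=file_name)


# Usage (needs boto3 and configured credentials):
#   import boto3
#   delete_everywhere(boto3.client("s3"), "elk-data-storage", "/home/ubuntu/data", "test.11")
```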