[2021.07.23] Intern +144 How to upload multiple files from Linux (Ubuntu 20.04) to AWS S3 using Python (boto3)? (+ syncing EC2 -> AWS S3 with cron every minute)

by injekim97 2021. 7. 23.

 

 

 

 

This post walks through how to upload multiple files from an AWS EC2 Linux instance to AWS S3.

 

 

 

ubuntu@ip-172-31-9-174:~$ python3 -m pip install boto3

Collecting boto3
  Downloading boto3-1.18.5.tar.gz (102 kB)
     |████████████████████████████████| 102 kB 1.7 MB/s
Collecting botocore<1.22.0,>=1.21.5
  Downloading botocore-1.21.5-py3-none-any.whl (7.7 MB)
     |████████████████████████████████| 7.7 MB 2.6 MB/s
Collecting jmespath<1.0.0,>=0.7.1
  Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.6.0,>=0.5.0
  Downloading s3transfer-0.5.0-py3-none-any.whl (79 kB)
     |████████████████████████████████| 79 kB 12.8 MB/s
Collecting python-dateutil<3.0.0,>=2.1
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     |████████████████████████████████| 247 kB 64.3 MB/s
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore<1.22.0,>=1.21.5->boto3) (1.25.8)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.22.0,>=1.21.5-
Building wheels for collected packages: boto3
  Building wheel for boto3 (setup.py) ... done
  Created wheel for boto3: filename=boto3-1.18.5-py3-none-any.whl size=128958 sha256=326c97492d68a83486f6c6b8c034ed66e0358fcd6c71b7502
  Stored in directory: /home/ubuntu/.cache/pip/wheels/2f/31/70/964258e8593ba28258a61b66403a19e702163f3dc027c811cd
Successfully built boto3



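Running the script right after installing boto3 fails with the error below, because no AWS credentials have been configured on the instance yet.

python3 local_pc_to_aws_S3_upload.py

Error : Unable to locate credentials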

 

 

 


ubuntu@ip-172-31-9-174:~$ sudo apt install awscli

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  docutils-common fontconfig fonts-droid-fallback fonts-noto-mono fonts-urw-base35 ghostscript groff gsfonts hicolor-icon-theme imagemagick imagemagick-6-common
  imagemagick-6.q16 libcairo2 libdatrie1 libdjvulibre-text libdjvulibre21 libfftw3-double3 libgs9 libgs9-common libidn11 libijs-0.35 libilmbase24 libimagequant0
  libjbig0 libjbig2dec0 liblqr-1-0 libmagickcore-6.q16-6 libmagickcore-6.q16-6-extra libmagickwand-6.q16-6 libnetpbm10 libopenexr24 libopenjp2-7 libpango-1.0-0
  libpangocairo-1.0-0 libpangoft2-1.0-0 libpaper-utils libpaper1 libpixman-1-0 libthai-data libthai0 libtiff5 libwebp6 libwebpdemux2 libwebpmux3 libwmf0.2-7
  libxcb-render0 libxcb-shm0 netpbm poppler-data psutils python3-botocore python3-dateutil python3-docutils python3-jmespath python3-olefile python3-pil
  python3-pygments python3-roman python3-rsa python3-s3transfer sgml-base xml-core
Suggested packages:
  fonts-noto fonts-freefont-otf | fonts-freefont-ttf fonts-texgyre ghostscript-x imagemagick-doc autotrace cups-bsd | lpr | lprng enscript ffmpeg gimp gnuplot
  grads graphviz hp2xx html2ps libwmf-bin mplayer povray radiance sane-utils texlive-base-bin transfig ufraw-batch xdg-utils libfftw3-bin libfftw3-dev inkscape
  libjxr-tools libwmf0.2-7-gtk poppler-utils fonts-japanese-mincho | fonts-ipafont-mincho fonts-japanese-gothic | fonts-ipafont-gothic fonts-arphic-ukai
  fonts-arphic-uming fonts-nanum docutils-doc fonts-linuxlibertine | ttf-linux-libertine texlive-lang-french texlive-latex-base texlive-latex-recommended
  python-pil-doc python3-pil-dbg python-pygments-doc ttf-bitstream-vera sgml-base-doc debhelper
The following NEW packages will be installed:
  awscli docutils-common fontconfig fonts-droid-fallback fonts-noto-mono fonts-urw-base35 ghostscript groff gsfonts hicolor-icon-theme imagemagick
  imagemagick-6-common imagemagick-6.q16 libcairo2 libdatrie1 libdjvulibre-text libdjvulibre21 libfftw3-double3 libgs9 libgs9-common libidn11 libijs-0.35
  libilmbase24 libimagequant0 libjbig0 libjbig2dec0 liblqr-1-0 libmagickcore-6.q16-6 libmagickcore-6.q16-6-extra libmagickwand-6.q16-6 libnetpbm10 libopenexr24
  libopenjp2-7 libpango-1.0-0 libpangocairo-1.0-0 libpangoft2-1.0-0 libpaper-utils libpaper1 libpixman-1-0 libthai-data libthai0 libtiff5 libwebp6 libwebpdemux2
  libwebpmux3 libwmf0.2-7 libxcb-render0 libxcb-shm0 netpbm poppler-data psutils python3-botocore python3-dateutil python3-docutils python3-jmespath
  python3-olefile python3-pil python3-pygments python3-roman python3-rsa python3-s3transfer sgml-base xml-core
0 upgraded, 63 newly installed, 0 to remove and 14 not upgraded.
Need to get 33.5 MB of archives.
After this operation, 161 MB of additional disk space will be used.
Do you want to continue? [Y/n] y

 

 

ubuntu@ip-172-31-9-174:~$ pip3 install awscli --upgrade --user

Collecting awscli
  Downloading awscli-1.20.5-py3-none-any.whl (3.7 MB)
     |████████████████████████████████| 3.7 MB 1.7 MB/s
Requirement already satisfied, skipping upgrade: s3transfer<0.6.0,>=0.5.0 in ./.local/lib/python3.8/site-packages (from awscli) (0.5.0)
Requirement already satisfied, skipping upgrade: PyYAML<5.5,>=3.10 in /usr/lib/python3/dist-packages (from awscli) (5.3.1)
Collecting docutils<0.16,>=0.10
  Downloading docutils-0.15.2-py3-none-any.whl (547 kB)
     |████████████████████████████████| 547 kB 43.1 MB/s
Requirement already satisfied, skipping upgrade: colorama<0.4.4,>=0.2.5 in /usr/lib/python3/dist-packages (from awscli) (0.4.3)
Requirement already satisfied, skipping upgrade: rsa<4.8,>=3.1.2 in /usr/lib/python3/dist-packages (from awscli) (4.0)
Requirement already satisfied, skipping upgrade: botocore==1.21.5 in ./.local/lib/python3.8/site-packages (from awscli) (1.21.5)
Requirement already satisfied, skipping upgrade: urllib3<1.27,>=1.25.4 in /usr/lib/python3/dist-packages (from botocore==1.21.5->awscli) (1.25.8)
Requirement already satisfied, skipping upgrade: jmespath<1.0.0,>=0.7.1 in ./.local/lib/python3.8/site-packages (from botocore==1.21.5->awscli) (0.10.0)
Requirement already satisfied, skipping upgrade: python-dateutil<3.0.0,>=2.1 in ./.local/lib/python3.8/site-packages (from botocore==1.21.5->awscli) (2.8.2)
Requirement already satisfied, skipping upgrade: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil<3.0.0,>=2.1->botocore==1.21.5->awscli) (1.14.0)
Installing collected packages: docutils, awscli
Successfully installed awscli-1.20.5 docutils-0.15.2

 

 

 

 

 

ubuntu@crawling:~/linux_s3_upload_dir$ aws --version

aws-cli/1.20.5 Python/3.8.10 Linux/5.8.0-1036-azure botocore/1.21.5

 

 

 



★ Run aws configure as the ubuntu user, not sudo aws configure: with sudo the credentials are written as root, so a script run as the ubuntu user cannot read them ★
ubuntu@ip-172-31-9-174:~$ aws configure

AWS Access Key ID [None]: <YOUR_ACCESS_KEY_ID>
AWS Secret Access Key [None]: <YOUR_SECRET_ACCESS_KEY>
Default region name [None]: ap-northeast-2
Default output format [None]: json
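
aws configure stores the key pair in ~/.aws/credentials (and the region and output format in ~/.aws/config) under the home directory of the user who ran it. As a quick sanity check before running the upload script, the sketch below reads that INI-style file with the standard library and reports whether a profile holds both keys. The helper name has_profile is made up for this illustration and is not part of boto3 or the AWS CLI.

```python
import configparser
from pathlib import Path


def has_profile(credentials_file, profile="default"):
    """Return True if the credentials file defines both keys for `profile`."""
    # interpolation=None so a '%' inside a secret key is never interpreted
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_file)  # a missing file is silently skipped
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return "aws_access_key_id" in section and "aws_secret_access_key" in section


def default_credentials_path():
    """Location used by aws configure for the current user."""
    return Path.home() / ".aws" / "credentials"
```

Calling has_profile(default_credentials_path()) as the ubuntu user should return True once aws configure has been run without sudo.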

 

 

 

--------------------------------------------------------------------------------------------------------------------------

Now that the configuration is done, let's look at the Python code, check what is currently in the S3 bucket, and then run the script.

 

 

 

<Checking the Linux directory and the S3 bucket before running the Python code>

 

 

 

ubuntu@crawling:~/linux_s3_upload_dir$ ls
0723_Linux_to_aws_S3_upload_succes.py  after1day.log  lawtalk_1day.log

 

 

 

 

-> As you can see, the S3 bucket is empty.

 

 

 

Linux_to_AWS S3 upload.py

#!/usr/bin/env python3
# coding: utf-8

import os
import boto3


def file_upload_s3():

    s3_bucket = "elk-data-storage"
    s3 = boto3.resource('s3')

    dir_path = []   # full local path of each file to upload
    file_name = []  # name of each file, reused as the S3 object key

    directory_path = "/home/ubuntu/linux_s3_upload_dir"       # ★★★★★ directory containing the files to upload ★★★★★
    for file in os.listdir(directory_path):
        path = os.path.join(directory_path, file)  # becomes /home/ubuntu/linux_s3_upload_dir/{file name}
        dir_path.append(path)
        file_name.append(file)

    print(file_name)  # show which files were found

    # ★★★★★ range() is deliberately set to 999 instead of the exact number of files ★★★★★
    # e.g. with 5 files only indexes 0-4 are valid, so once every file has been uploaded
    # the loop stops with an IndexError ("list index out of range"), which is caught below.
    # Caveats: at most 999 files are uploaded, and a real upload error is caught by the
    # same except clause, which ends the loop early.

    try:
        for i in range(999):
            s3.meta.client.upload_file(dir_path[i], s3_bucket, file_name[i])

    except Exception as e:
        print(e)

    # Message: "Local PC -> AWS S3 file upload completed."
    print("Local PC -> AWS S3 파일 업로드를 완료하였습니다.")

file_upload_s3()
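
The script above relies on an IndexError to end its loop. As an alternative, here is a minimal sketch that walks the directory listing directly, skips sub-directories, and keeps going when a single upload fails. The helper names list_uploads, upload_all and main are invented for this example; upload_file(Filename, Bucket, Key) is the boto3 S3 client call the script already uses, and the bucket and directory are the ones from this post.

```python
import os


def list_uploads(directory_path):
    """Return (local_path, s3_key) pairs for each regular file in the directory."""
    pairs = []
    for name in sorted(os.listdir(directory_path)):
        path = os.path.join(directory_path, name)
        if os.path.isfile(path):  # skip sub-directories
            pairs.append((path, name))
    return pairs


def upload_all(client, bucket, directory_path):
    """Upload every file; return the keys that failed instead of stopping early."""
    failed = []
    for path, key in list_uploads(directory_path):
        try:
            client.upload_file(path, bucket, key)
        except Exception as e:  # broad on purpose: report the file and continue
            print(f"failed to upload {path}: {e}")
            failed.append(key)
    return failed


def main():
    # Imported here so the helpers above can be tried without AWS access.
    import boto3

    failed = upload_all(boto3.client("s3"), "elk-data-storage",
                        "/home/ubuntu/linux_s3_upload_dir")
    print("failed uploads:", failed)
```

To run it against AWS, call main() at the bottom of the file. Passing the client in as an argument keeps the directory logic separate from the network calls, so it can be checked with a stand-in object first.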

 

 

 

<Execution result>

ubuntu@crawling:~/linux_s3_upload_dir$ python3 0723_Linux_to_aws_S3_upload_succes.py

['0723_Linux_to_aws_S3_upload_succes.py', 'after1day.log', 'lawtalk_1day.log']
list index out of range
Local PC -> AWS S3 파일 업로드를 완료하였습니다.
ubuntu@crawling:~/linux_s3_upload_dir$

 

 

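<Success>

The list index out of range line is expected: the loop is written to run up to 999 times and stops as soon as it runs past the last file. The final line of the output reads "Local PC -> AWS S3 file upload completed."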
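
To confirm the upload from Python rather than from the S3 console, one option is to list the bucket and compare it with the file names that were expected. This is a sketch under assumptions: list_keys and missing_keys are made-up helper names, while get_paginator("list_objects_v2") is the standard boto3 S3 client call for listing objects page by page.

```python
def list_keys(client, bucket):
    """Return every object key in the bucket, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent from a page that holds no objects
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def missing_keys(client, bucket, expected):
    """Return the expected keys that are not present in the bucket."""
    present = set(list_keys(client, bucket))
    return [key for key in expected if key not in present]
```

With a real client this would be called as missing_keys(boto3.client("s3"), "elk-data-storage", os.listdir("/home/ubuntu/linux_s3_upload_dir")); an empty list means every local file name has a matching object.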

 

 

--------------------------------------------------------------------------------------------------------------------

Syncing EC2 -> AWS S3 with cron every minute

ubuntu@ip-172-31-9-174:~$ cat s3_upload.sh
python3 /home/ubuntu/local_pc_to_aws_S3_upload.py

ubuntu@ip-172-31-9-174:~$ crontab -l
* * * * * /home/ubuntu/s3_upload.sh > s3_upload.log
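
A crontab schedule has five fields (minute, hour, day of month, month, day of week), so * * * * * runs the job once per minute, which is the shortest interval cron offers. If a shorter interval is really needed, one option is a long-running loop instead of cron. The sketch below is only an illustration: run_every and its parameters are hypothetical names, and the task passed in would be the upload function.

```python
import time


def run_every(task, interval_seconds, iterations=None, sleep=time.sleep):
    """Call task() repeatedly, pausing interval_seconds between calls.

    iterations=None loops forever; a number stops after that many calls.
    Returns the number of calls made.
    """
    count = 0
    while iterations is None or count < iterations:
        try:
            task()
        except Exception as e:  # keep the loop alive if one run fails
            print(f"task failed: {e}")
        count += 1
        if iterations is None or count < iterations:
            sleep(interval_seconds)  # no pause after the final call
    return count
```

For example, run_every(file_upload_s3, 1) would call the upload function about once a second until the process is stopped. Re-uploading every file that often generates a lot of S3 requests, so a once-a-minute cron job is usually the simpler choice.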

 

 

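After editing the crontab with crontab -e, cron picks up the change on its own, so a restart is not strictly required. Restarting the service and then checking its status, as below, is still a quick way to confirm that the job is running.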

ubuntu@ip-172-31-9-174:~$ sudo service cron restart
ubuntu@ip-172-31-9-174:~$ systemctl status cron

● cron.service - Regular background program processing daemon
     Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-07-26 12:49:00 KST; 6min ago
       Docs: man:cron(8)
   Main PID: 50764 (cron)
      Tasks: 1 (limit: 9536)
     Memory: 1.6M
     CGroup: /system.slice/cron.service
             └─50764 /usr/sbin/cron -f

Jul 26 12:52:02 ip-172-31-9-174 CRON[50832]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:53:01 ip-172-31-9-174 CRON[50859]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:53:01 ip-172-31-9-174 CRON[50860]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:53:03 ip-172-31-9-174 CRON[50859]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:54:01 ip-172-31-9-174 CRON[50889]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:54:01 ip-172-31-9-174 CRON[50890]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:54:02 ip-172-31-9-174 CRON[50889]: pam_unix(cron:session): session closed for user ubuntu
Jul 26 12:55:01 ip-172-31-9-174 CRON[50944]: pam_unix(cron:session): session opened for user ubuntu by (uid=0)
Jul 26 12:55:01 ip-172-31-9-174 CRON[50945]: (ubuntu) CMD (/home/ubuntu/s3_upload.sh > s3_upload.log)
Jul 26 12:55:03 ip-172-31-9-174 CRON[50944]: pam_unix(cron:session): session closed for user ubuntu

 

--------------------------------------------------------------------------------------------------------------------

AWS S3 commands you can use on Linux

 

# List AWS S3 buckets

ubuntu@crawling:~/linux_s3_upload_dir$ aws s3 ls

2021-03-18 18:26:08 allthatfabric
2021-03-25 15:42:42 allthatfabric-file
2021-07-21 16:58:23 elk-data-storage
2021-04-24 12:39:52 mychrnoicdisease
2021-06-11 12:07:13 s3-06111207


# Delete a file in an AWS S3 bucket

ubuntu@ip-172-31-9-174:~/data$ aws s3 rm s3://elk-data-storage/test.11
delete: s3://elk-data-storage/test.11


 

 

# Also delete the file on Linux, not only in the bucket (otherwise the cron job above uploads it again on its next run)

ubuntu@ip-172-31-9-174:~/data$ rm -rf test.11
ubuntu@ip-172-31-9-174:~/data$ ls
0723_waist.csv  air_quality.csv  국가건강검진_혈압혈당데이터.csv  국가건강검진_혈액검사데이터.csv
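
The two manual steps above can be combined in Python. The sketch below removes the local copy first, so a scheduled upload cannot put the file back in between, and then deletes the object. delete_everywhere is a made-up helper name; delete_object(Bucket=..., Key=...) is the standard boto3 S3 client call.

```python
import os


def delete_everywhere(client, bucket, directory_path, name):
    """Delete `name` from the local directory and from the bucket.

    Returns True if a local file was removed, False if none existed.
    """
    path = os.path.join(directory_path, name)
    removed_local = False
    if os.path.isfile(path):
        os.remove(path)  # local copy first, so cron cannot re-upload it
        removed_local = True
    client.delete_object(Bucket=bucket, Key=name)
    return removed_local
```

With a real client, the example from this section would be delete_everywhere(boto3.client("s3"), "elk-data-storage", "/home/ubuntu/data", "test.11").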


