Retrieving and connecting the Carerix Datasource

FileZilla client
FTP username
FTP password
The IP address from which the FileZilla client is used must be on our whitelist
If an outbound firewall is in use: TCP ports 21 and 21000 through 21010 must be allowed for IP address 3.121.45.103

3. WinSCP client command line interface (FTP server)

Operating system: Windows

Tested on Windows 11 with WinSCP 6.1.1

Required:

WinSCP (Typical install)
FTP username
FTP password
The IP address from which WinSCP is used must be on our whitelist
If an outbound firewall is in use: TCP ports 21 and 21000 through 21010 must be allowed for IP address 3.121.45.103
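
The firewall requirement above can be written down as a short port list. The following Python sketch is only an illustration: the helper names `required_ports` and `control_port_reachable` are hypothetical. It assumes that ports 21000 through 21010 are the passive data range, which is only opened during an actual transfer, so the reachability check is limited to control port 21.

```python
import socket

FTP_SERVER_IP = "3.121.45.103"  # IP address named in the requirements above


def required_ports():
    """Return the TCP ports the outbound firewall must allow: 21 plus 21000-21010."""
    return [21] + list(range(21000, 21011))


def control_port_reachable(host=FTP_SERVER_IP, port=21, timeout=5.0):
    """Return True if a TCP connection to the control port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Blocked by a firewall, not whitelisted, or no route to the host
        return False
```

A result of False from `control_port_reachable()` does not tell you whether the outbound firewall or the whitelist is the cause; it only shows that port 21 cannot be reached from this machine.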

Open command prompt (cmd.exe)

"C:\Program Files (x86)\WinSCP\WinSCP.com"

winscp> open ftpes://carerix.user1:xxxxxx@datasource.carerix.net:21

Connecting to datasource.carerix.net ...

TLS connection established. Waiting for welcome message...

Connected

Starting the session...

Session started.

Active session: [1] xxxxxx.user1@datasource.carerix.net

winscp> ls crmatch*

D--------- 0 0 ..

---------- 0 23245 Aug 20 1:22:28 2023 Cragency.csv

...

winscp> get crcompany.csv c:\temp\

crcompany.csv | 36260 KB | 2131.9 KB/s | binary | 100%

winscp> exit

As a command in a batch file:

"C:\Program Files (x86)\WinSCP\WinSCP.com" /command ^
 "open ftpes://carerix.user1:xxxxxx@datasource.carerix.net:21" ^
 "get *.csv c:\temp\" ^
 "exit"

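The same download can also be scripted without WinSCP. Below is a minimal sketch using `ftplib.FTP_TLS` from the Python standard library, which speaks explicit FTP over TLS (the protocol that `ftpes://` selects). The function names `select_csv` and `download_csv_files` are hypothetical, and the sketch assumes the server accepts data connections without TLS session reuse, which `ftplib` does not perform.

```python
from ftplib import FTP_TLS
from pathlib import Path


def select_csv(names):
    """Keep only names ending in .csv (case-insensitive)."""
    return [n for n in names if n.lower().endswith(".csv")]


def download_csv_files(host, user, password, target_dir):
    """Download every .csv file in the FTP login directory to target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    downloaded = []
    with FTP_TLS(host) as ftps:          # connects to port 21
        ftps.login(user, password)       # upgrades the control channel to TLS first
        ftps.prot_p()                    # encrypt the data channel as well
        for name in select_csv(ftps.nlst()):
            with open(target / name, "wb") as fh:
                ftps.retrbinary(f"RETR {name}", fh.write)
            downloaded.append(name)
    return downloaded


# Example call (placeholders as in the WinSCP example):
# download_csv_files("datasource.carerix.net", "carerix.user1", "xxxxxx", r"c:\temp")
```
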
4. Cyberduck GUI (S3 bucket)

Operating system: Linux, Windows and macOS

Tested on macOS 15.2 with Cyberduck client 9.1.0

Documentation

Required:

Cyberduck client
AWS S3 bucket name: datasource-klantnaam (replace klantnaam with your customer name)
AWS access key id
AWS secret access key
AWS region: eu-central-1 (Frankfurt)

5. Power BI Desktop Python connector (S3 bucket)

Retrieves the data of one specific file in the S3 bucket.

Operating system: Windows

Tested with Power BI Desktop 2.120.731.0 and Python 3.11.4

Required:

Power BI Desktop
Python 3, pip and a few Python modules
AWS S3 bucket name: datasource-klantnaam (replace klantnaam with your customer name)
AWS access key id
AWS secret access key
AWS region: eu-central-1 (Frankfurt)

Open command prompt (cmd.exe)

# go to the folder where Python is installed

cd C:\Users\xxxxxx\AppData\Local\Programs\Python\Python311

# install the Python package manager pip if it is missing (recent Python 3 installers normally include it)

curl https://bootstrap.pypa.io/get-pip.py | python

# install the required Python modules

Scripts\pip.exe install boto3 matplotlib pandas
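
Before switching to Power BI, it can help to confirm that the modules are importable by this Python installation. A small sketch using only the standard library; `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found by this Python installation."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules installed in the step above; an empty list means all were found
print(missing_modules(["boto3", "matplotlib", "pandas"]))
```
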

Open Power BI Desktop > Home > Get data > More > Python script

Crcompany.csv example

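import boto3
import pandas as pd

# AWS credentials and the file to load (replace the xxxxxx placeholders)
my_key = 'xxxxxx'
my_secret = 'xxxxxx'
my_bucket_name = 'datasource-xxxxxx'
my_file = 'Crcompany.csv'

# open a session with the access key pair; the bucket is in eu-central-1 (Frankfurt)
session = boto3.Session(aws_access_key_id=my_key, aws_secret_access_key=my_secret, region_name='eu-central-1')
s3 = session.resource('s3')
bucket = s3.Bucket(my_bucket_name)

# download the file to the current working directory
bucket.download_file(my_file, my_file)

# the file is read as tab-delimited despite the .csv extension
crcompany = pd.read_csv(my_file, delimiter='\t')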
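
The example above reads the file as tab-delimited. If a file does not load into the expected columns, the delimiter of the downloaded file can be checked first. A minimal standard-library sketch; `guess_delimiter` and `file_delimiter` are hypothetical helpers that only inspect the header line, assuming the header contains no delimiter characters inside field names and that the file is UTF-8 encoded.

```python
def guess_delimiter(header_line, candidates=("\t", ",", ";")):
    """Return the candidate character that occurs most often in the header line.

    If none of the candidates occurs, the first candidate is returned.
    """
    return max(candidates, key=header_line.count)


def file_delimiter(path, encoding="utf-8"):
    """Read the first line of a file and guess its delimiter."""
    with open(path, encoding=encoding) as fh:
        return guess_delimiter(fh.readline())
```

The result of `file_delimiter('Crcompany.csv')` can then be passed as the `delimiter` argument of `pd.read_csv`.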

6. AWS command line interface (S3 bucket)

Operating system: Linux, Windows and macOS

Tested on macOS 15.2 with AWS CLI 2.22.17

Documentation

Required:

AWS CLI
AWS S3 bucket name: datasource-klantnaam (replace klantnaam with your customer name)
AWS access key id
AWS secret access key
AWS region: eu-central-1 (Frankfurt)

# macOS example

$ which aws

/usr/local/bin/aws

$ aws --version

aws-cli/2.2.23 Python/3.8.8 Darwin/22.6.0 exe/x86_64 prompt/off

# create a default profile

$ aws configure

AWS Access Key ID [None]: xxxxxx

AWS Secret Access Key [None]: xxxxxx

Default region name [None]: eu-central-1

Default output format [None]: json

$ aws s3 ls s3://datasource-klantnaam/

2023-08-19 03:32:06 8456 Cragency.csv

2023-08-19 03:32:06 4579 Crarticle.csv

2023-08-19 03:32:06 153351138 Crattachment.csv

2023-08-19 03:32:08 509534996 Crattributechange.csv

...

# synchronize all csv files to the local folder /var/tmp

# only files that have changed are downloaded; this avoids unnecessary downloads

$ aws s3 sync s3://datasource-klantnaam/ /var/tmp --exclude "*" --include "*.csv"
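
The "download only when changed" behaviour can be approximated in a script of your own. The sketch below is not the AWS CLI's internal implementation; it is a simplified rule, stated here as an assumption: download when the local file is missing, differs in size, or is older than the remote object. `needs_download` is a hypothetical helper name.

```python
import os
from datetime import datetime, timezone


def needs_download(local_path, remote_size, remote_modified):
    """Decide whether a remote object should be downloaded.

    remote_size is in bytes; remote_modified is a timezone-aware datetime.
    """
    if not os.path.exists(local_path):
        return True                      # no local copy yet
    if os.path.getsize(local_path) != remote_size:
        return True                      # size differs from the remote object
    local_modified = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return remote_modified > local_modified  # remote copy is newer
```

The size and last-modified values correspond to the columns shown by `aws s3 ls` above.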