English to German Translator

WHAT IS IT?

This project translates English to German using sequence-to-sequence models built from multiple RNNs. RNNs are designed to take sequences of text as inputs, return sequences of text as outputs, or both. They’re called recurrent because the network’s hidden layers have a loop in which the output and cell state from each time step become inputs at the next time step. This recurrence serves as a form of memory.

Below is a summary of the preprocessing and modeling steps. The high-level steps are:

Preprocessing: load and examine data, cleaning, tokenization, padding.
Modeling: build, train, and test the model.
Prediction: generate specific translations of English to German, and compare the output translations to the ground truth translations.
Iteration: iterate on the model, experimenting with different architectures.
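The tokenization and padding steps can be illustrated with a minimal sketch. It is not the project's actual preprocessing code: the whitespace tokenizer, the `<pad>` and `<unk>` token names, and the helper functions `tokenize`, `build_vocab`, and `encode_and_pad` are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the real project may use different names.
PAD, UNK = "<pad>", "<unk>"


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return sentence.lower().split()


def build_vocab(tokenized, min_count=1):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(tok for sent in tokenized for tok in sent)
    vocab = {PAD: 0, UNK: 1}
    for tok, n in sorted(counts.items()):  # sorted so ids are deterministic
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode_and_pad(tokenized, vocab):
    """Convert tokens to ids and right-pad every sentence to the longest one."""
    max_len = max(len(sent) for sent in tokenized)
    batch = []
    for sent in tokenized:
        ids = [vocab.get(tok, vocab[UNK]) for tok in sent]
        batch.append(ids + [vocab[PAD]] * (max_len - len(ids)))
    return batch


sentences = ["I am cold", "She is driving the truck"]
tokens = [tokenize(s) for s in sentences]
vocab = build_vocab(tokens)
padded = encode_and_pad(tokens, vocab)
print(padded)
```

Padding every sentence to the same length lets a batch be stored as one rectangular array, which is what an RNN expects as input. The pad id is kept at 0 here so it is easy to mask out later.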

HOW TO USE?

To run the inference script, run the command:

    python run.py -model german/averaged-10-epoch.pt -src input/test.txt -output pred.txt -replace_unk -verbose

To see the other options and the help text, run:

    python run.py -h

usage:

run.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG]
       --model MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
       [--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR] [--tgt TGT]
       [--shard_size SHARD_SIZE] [--output OUTPUT] [--report_bleu]
       [--report_rouge] [--report_time] [--dynamic_dict] [--share_vocab]
       [--random_sampling_topk RANDOM_SAMPLING_TOPK]
       [--random_sampling_temp RANDOM_SAMPLING_TEMP] [--seed SEED]
       [--beam_size BEAM_SIZE] [--min_length MIN_LENGTH]
       [--max_length MAX_LENGTH] [--max_sent_length] [--stepwise_penalty]
       [--length_penalty {none,wu,avg}] [--ratio RATIO]
       [--coverage_penalty {none,wu,summary}] [--alpha ALPHA] [--beta BETA]
       [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
       [--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
       [--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
       [--log_file LOG_FILE]
       [--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
       [--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
       [--batch_size BATCH_SIZE] [--batch_type {sents,tokens}] [--gpu GPU]
       [--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
       [--window_stride WINDOW_STRIDE] [--window WINDOW]
       [--image_channel_size {3,1}]
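Once the script has written its translations to the output file, they can be compared with ground truth translations. The following is a minimal sketch of one crude comparison, the exact-match rate over aligned lines. The functions `read_lines` and `exact_match_rate` are hypothetical helpers, not part of run.py, and the toy sentences stand in for real files. The usage listing above also shows `--tgt` and `--report_bleu` options, which may offer a built-in alternative.

```python
def read_lines(path):
    """Read a UTF-8 text file into a list of lines without newlines."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def exact_match_rate(predictions, references):
    """Fraction of predictions that equal their reference line exactly.

    Leading and trailing whitespace is ignored. This is a strict metric:
    a translation that is correct but worded differently counts as a miss.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(predictions)


# Toy data standing in for the contents of pred.txt and a reference file.
preds = ["guten morgen", "danke schön"]
refs = ["guten morgen", "vielen dank"]
rate = exact_match_rate(preds, refs)
print(f"exact match rate: {rate:.2f}")
```

With real files, `preds` would come from `read_lines("pred.txt")` and `refs` from a reference file whose path depends on the dataset layout.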

WHAT ARE THE REQUIREMENTS?

To install all the requirements and dependencies, run one of the following commands.

For GPU:

    pip install -r gpu_requirements.txt

For CPU:

    pip install -r cpu_requirements.txt

Framework	PyTorch
OS Used	Linux