Configuration

Location

Problem statement

  • How can we create a more efficient way to work with configuration?
  • How can we make the configuration file(s) available globally so that PyFunceble can be run everywhere in the user workspace?

To answer these questions, we moved the configuration to the place where most users expect to find their configuration file(s).

Repository clone

If you cloned the repository and are testing from the cloned directory (the one that contains, for example, CONTRIBUTING.md), we use the current directory as the configuration directory.

Note

This behavior allows us to not modify the way we develop PyFunceble.

Travis CI

Under Travis CI, we search for or initialize the configuration in the current directory.

Warning

The distribution does not matter: as long as the TRAVIS_BUILD_DIR environment variable is set, we search for or initialize the configuration in the current directory.

Note

If you want to force the directory where we should work, set the PYFUNCEBLE_CONFIG_DIR environment variable to the path of that directory.

GitLab CI/CD

Under GitLab CI/CD, we search for or initialize the configuration in the current directory.

Warning

The distribution does not matter: as long as the PROJECT_CI and GITLAB_CI environment variables are set, we search for or initialize the configuration in the current directory.

Note

If you want to force the directory where we should work, set the PYFUNCEBLE_CONFIG_DIR environment variable to the path of that directory.
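The CI checks described in the two sections above can be sketched as follows. This is only an illustration built from the environment variable names given in this page; the function name is hypothetical and is not part of PyFunceble.

```python
import os


def is_ci_environment(environ=None):
    """Return True when the CI variables named in this page are set.

    Hypothetical helper: it only checks for the presence of the variables.
    """
    env = os.environ if environ is None else environ

    # Travis CI: TRAVIS_BUILD_DIR must be set.
    travis = "TRAVIS_BUILD_DIR" in env
    # GitLab CI/CD: both PROJECT_CI and GITLAB_CI must be set.
    gitlab = "PROJECT_CI" in env and "GITLAB_CI" in env

    return travis or gitlab
```

When this returns True, the configuration would be searched for or initialized in the current directory, unless PYFUNCEBLE_CONFIG_DIR says otherwise.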

Linux and macOS (Darwin Kernel)

Under Linux and macOS, we look for the following directories in the order listed. When a configuration directory is found, PyFunceble offers to install the configuration file(s) into it automatically.

  1. ~/.config/PyFunceble
  2. ~/.PyFunceble
  3. ${PWD}

Note

If the parent directory does not exist, we move to the next possible location in the given order.

This means that under most Linux distributions and macOS versions, we consider ~/.config/PyFunceble as the configuration location. But if the ~/.config directory does not exist, we fall back to ~/.PyFunceble as the configuration location.
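The lookup order above can be sketched in a few lines of Python. This is a minimal illustration of the documented rule (skip a candidate when its parent directory does not exist), not PyFunceble's actual code; the function name is hypothetical.

```python
from pathlib import Path


def resolve_unix_config_dir(home, cwd):
    """Return the first candidate whose parent directory exists.

    Candidates follow the documented order:
    ~/.config/PyFunceble, then ~/.PyFunceble, then the current directory.
    """
    home, cwd = Path(home), Path(cwd)
    candidates = [home / ".config" / "PyFunceble", home / ".PyFunceble"]

    for candidate in candidates:
        # Move on when the parent directory is missing.
        if candidate.parent.is_dir():
            return candidate

    return cwd
```

For example, calling it with Path.home() and Path.cwd() on a machine that has a ~/.config directory returns ~/.config/PyFunceble.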

Windows

We relied on the section "Per user configuration files synchronized across domain joined machines via Active Directory Roaming" of Pat Altimore’s Blog to decide where our configuration directory should live.

Under Windows, we look for the following directories in the order listed. When a configuration directory is found, PyFunceble offers to install the configuration file(s) into it automatically.

  1. %APPDATA%\PyFunceble (environment variable)
  2. %CD%

Note

%CD% is explained by the set command (set /?):

%CD% - expands to the current directory string.

Note

If the parent directory does not exist, we move to the next possible location in the given order.

This means that under most Windows versions, we consider %APPDATA%\PyFunceble (also known as C:\Users\userName\AppData\Roaming\PyFunceble) as the configuration location. But if the %APPDATA% directory does not exist, we fall back to the current directory as the configuration location.
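The Windows rule can be sketched the same way. The helper below is hypothetical and only mirrors the description above: use %APPDATA%\PyFunceble when the %APPDATA% directory exists, otherwise use the current directory.

```python
from pathlib import Path


def resolve_windows_config_dir(environ, cwd):
    """Return %APPDATA%/PyFunceble when %APPDATA% exists, else the current directory."""
    appdata = environ.get("APPDATA")

    # The parent directory (%APPDATA%) must exist, otherwise we fall back.
    if appdata and Path(appdata).is_dir():
        return Path(appdata) / "PyFunceble"

    return Path(cwd)
```

In real use, environ would be os.environ and cwd would be os.getcwd().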

Custom location

Sometimes, you may find yourself in a position where you absolutely do not want PyFunceble to use its default configuration location.

For that reason, if you set the PYFUNCEBLE_CONFIG_DIR environment variable to your desired configuration location, we use that location as the (default) configuration location.
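As a rough sketch, the override can be read like this before any platform-specific lookup takes place. The function name is hypothetical, and treating an empty value as "not set" is an assumption of this example.

```python
import os
from pathlib import Path


def config_dir_override(environ=None):
    """Return the directory forced through PYFUNCEBLE_CONFIG_DIR, or None."""
    env = os.environ if environ is None else environ
    value = env.get("PYFUNCEBLE_CONFIG_DIR")

    # Assumption: an empty value is treated like an unset variable.
    return Path(value) if value else None
```

A caller would use the returned path when it is not None and fall back to the default lookup otherwise.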

Autoconfiguration

Sometimes, you may find yourself in a position where you do not want to, or cannot, answer the question that asks whether you would like to install the default configuration file.

For that reason, if you set the PYFUNCEBLE_AUTO_CONFIGURATION environment variable to any value, we do not ask that question. We simply do what we have to do without asking anything.
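A minimal sketch of that decision follows. It assumes that the mere presence of the variable is enough, whatever its value; the function name is hypothetical.

```python
import os


def should_ask_before_install(environ=None):
    """Return False when PYFUNCEBLE_AUTO_CONFIGURATION is present in the environment."""
    env = os.environ if environ is None else environ

    # Assumption: any assignment, even an empty one, disables the question.
    return "PYFUNCEBLE_AUTO_CONFIGURATION" not in env
```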

Indexes

This page details each configuration index available in .PyFunceble.yaml, along with the locations where we look for the configuration file.

adblock

Type: boolean

Default value: False

Description: Enable / disable the adblock format decoding.

Note

If this index is set to True, every time we read a given file, we try to extract the elements that are present.

We basically only decode the adblock format.

Note

If this index is set to False, every time we read a given file, we will consider one line as an element to test.

aggressive

Type: boolean

Default value: False

Description: Enable / disable some aggressive settings.

Warning

This option is available, but please keep in mind that some of the settings it enables are experimental.

auto_continue

Type: boolean

Default value: True

Description: Enable / disable the auto continue system.

command

Type: string

Default value: ""

Description: Set the command to run before each commit (except the final one).

Note

The parsed command is called only if auto_continue and ci are set to True.

command_before_end

Type: string

Default value: ""

Description: Set the command to run before the final commit.

Note

The parsed command is called only if auto_continue and ci are set to True.

Note

The final commit is the commit that delivers the last element we have to test.

cooldown_time

Type: float

Default value: null

Description: Set the cooldown time to apply between each test.

Note

This index takes effect only from the CLI, not from the API.

custom_ip

Type: string

Default value: "0.0.0.0"

Description: Set the custom IP to use when we generate a line in the hosts file format.

Note

This index has no effect if generate_hosts is set to False.
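To illustrate what custom_ip changes, here is a minimal sketch of a hosts-format line. The helper name and the single-space separator are assumptions of this example.

```python
def hosts_line(subject, custom_ip="0.0.0.0"):
    """Build one line in the hosts file format: '<ip> <subject>'."""
    return f"{custom_ip} {subject}"


print(hosts_line("example.org"))
print(hosts_line("example.org", custom_ip="127.0.0.1"))
```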

days_between_db_retest

Type: integer

Default value: 1

Description: Set the number of days between each retest of the INACTIVE and INVALID elements which are present in inactive_db.json.

Note

This index has no effect if inactive_database is set to False.

days_between_inactive_db_clean

Type: integer

Default value: 28

Description: Set the number of days after the introduction of a subject into inactive_db.json before it qualifies for deletion.

Note

This index has no effect if inactive_database is set to False.
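Both days_between_db_retest and days_between_inactive_db_clean boil down to an age check on a stored entry. The sketch below is only an illustration: the function name is hypothetical, and the inclusive comparison is an assumption.

```python
from datetime import datetime, timedelta


def is_due(recorded_at, days, now):
    """Return True when at least `days` days have passed since `recorded_at`."""
    return now - recorded_at >= timedelta(days=days)


now = datetime(2020, 1, 29)
# With days_between_db_retest = 1, an entry from the day before is retested.
print(is_due(datetime(2020, 1, 28), 1, now))
# With days_between_inactive_db_clean = 28, an entry from January 2 is kept for now.
print(is_due(datetime(2020, 1, 2), 28, now))
```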

db_type

Type: string

Default value: json

Available values: json, mariadb, mysql

Description: Set the database type to use every time we create a database.

Note

This feature is applied to the following subsystems:

  • Autocontinue physically located (JSON) at output/continue.json.
  • InactiveDB physically located (JSON) at [config_dir]/inactive_db.json.
  • Mining physically located (JSON) at [config_dir]/mining.json.
  • WhoisDB physically located (JSON) at [config_dir]/whois_db.json.

debug

Type: boolean

Default value: False

Description: Enable / disable the generation of debug file(s).

Note

This index has no effect if logs is set to False.

Warning

Do not touch this index unless you have a good reason to.

Warning

Do not touch this index unless you have been invited to.

dns_lookup_over_tcp

Type: boolean

Default value: False

Description: Perform all DNS lookups over TCP instead of UDP.

dns_server

Type: None or list

Default value: null

Description: Set the DNS server(s) to work with.

Note

When a list is given, the following format is expected:

dns_server:
  - dns1.example.org
  - dns2.example.org

Note

You can specify the port number to use for a DNS server if needed.

For example:

- 127.0.1.53:5353
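As a sketch of how such an entry could be interpreted, the helper below splits an optional port from a server entry. It is hypothetical, assumes the standard DNS port 53 when none is given, and only handles hostnames and IPv4 addresses (an IPv6 literal would need different handling).

```python
def split_dns_server(entry, default_port=53):
    """Split 'host:port' into (host, port); use default_port when no port is given."""
    host, separator, port = entry.rpartition(":")

    if separator and port.isdigit():
        return host, int(port)

    return entry, default_port


print(split_dns_server("127.0.1.53:5353"))
print(split_dns_server("dns1.example.org"))
```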

Warning

We expect at least one DNS server. If no DNS server is given, you will almost certainly get all results as INACTIVE.

This can happen if you use --dns -f.

filter

Type: string

Default value: ""

Description: Set the element to filter.

Note

This index should be initiated with a regular expression.
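A minimal sketch of a regular-expression filter follows. It assumes that only the elements matching the expression are kept for testing and that an empty filter keeps everything; the function name is hypothetical.

```python
import re


def apply_filter(subjects, pattern):
    """Keep only the subjects that match the given regular expression."""
    if not pattern:
        # Assumption: the default empty string means "no filtering".
        return list(subjects)

    regex = re.compile(pattern)
    return [subject for subject in subjects if regex.search(subject)]


print(apply_filter(["example.org", "example.com"], r"\.org$"))
```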

generate_complements

Type: boolean

Default value: False

Description: Enable / disable the generation and test of the complements.

Note

For example, the complement of www.example.org is example.org, and vice versa.
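The rule in the note above can be written as a short function. This is only a sketch of the www. case described here; the function name is hypothetical.

```python
def complement(subject):
    """Return the www/non-www counterpart of a subject."""
    if subject.startswith("www."):
        return subject[len("www."):]

    return "www." + subject


print(complement("www.example.org"))
print(complement("example.org"))
```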

generate_hosts

Type: boolean

Default value: True

Description: Enable / disable the generation of the hosts file(s).

generate_json

Type: boolean

Default value: False

Description: Enable / disable the generation of the JSON file(s).

header_printed

Type: boolean

Default value: False

Description: Tell the system whether the header has already been printed.

Warning

Do not touch this index unless you have a good reason to.

hierarchical_sorting

Type: boolean

Default value: False

Description: Tell the system whether we have to sort the list and the outputs in a hierarchical order.

iana_whois_server

Type: string

Default value: whois.iana.org

Description: Set the server to call to get the whois referer of a given element.

Note

This index is only used when generating the iana-domains-db.json file.

Warning

Do not touch this index unless you have a good reason to.

idna_conversion

Type: boolean

Default value: False

Description: Tell the system to convert all domains to IDNA before testing.

Note

We use domain2idna for the conversion.

inactive_database

Type: boolean

Default value: True

Description: Enable / Disable the usage of a database to store the INACTIVE and INVALID elements to retest over time.

less

Type: boolean

Default value: True

Description: Enable / Disable the output of all information on screen.

local

Type: boolean

Default value: False

Description: Enable / Disable the execution of the test(s) in a local or private network.

logs

Type: boolean

Default value: True

Description: Enable / Disable the output of all logs.

maximal_processes

Type: integer

Default value: 25

Description: Set the maximal number of simultaneous processes to use/create/run.

Warning

If you omit the --processes argument, we overwrite the default with the number of available CPUs.

mining

Type: boolean

Default value: True

Description: Enable / Disable the mining subsystem.

multiprocess

Type: boolean

Default value: False

Description: Enable / Disable the usage of multiple processes instead of the default single process.

multiprocess_merging_mode

Type: string

Default value: end

Available values: end, live

Description: Set the multiprocess merging mode.

Note

With the end value, the merging of cross process data is made at the very end of the current instance.

Note

With the live value, the merging of cross process data is made after the processing of the maximal number of processes.

This means that if you allow 5 processes, we run 5 tests, merge, run 5 tests, merge, and so on until the end.
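The live mode described above amounts to processing the subjects in groups of at most maximal_processes and merging after each group. The generator below is a hypothetical sketch of that grouping only; it does not start any process.

```python
def batches(subjects, maximal_processes):
    """Yield consecutive groups of at most `maximal_processes` subjects."""
    for start in range(0, len(subjects), maximal_processes):
        yield subjects[start:start + maximal_processes]


for group in batches(list(range(12)), 5):
    # In live mode, a merge would happen after each group has been tested.
    print(group)
```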

no_files

Type: boolean

Default value: False

Description: Enable / Disable the generation of any file(s).

no_special

Type: boolean

Default value: False

Description: Enable / Disable the usage of the SPECIAL rules, which are described in the source column section.

no_whois

Type: boolean

Default value: False

Description: Enable / Disable the usage of whois in the tests.

plain_list_domain

Type: boolean

Default value: False

Description: Enable / Disable the generation of the plain list of elements sorted by statuses.

Warning

Do not touch this index unless you have a good reason to.

quiet

Type: boolean

Default value: False

Description: Enable / Disable the generation of output on the screen.

referer

Type: string

Default value: ""

Description: Set the referer of the element that is currently under test.

Warning

Do not touch this index unless you have a good reason to.

reputation

Type: boolean

Default value: False

Description: Enable / disable the reputation (only) testing.

Warning

If this index is set to True, we ONLY check for reputation, not for availability or syntax.

shadow_file

Type: boolean

Default value: False

Description: Enable / Disable the usage and generation of a shadow file before the test of a file.

Note

The shadow file contains only the actual list of subjects to test.

share_logs

Type: boolean

Default value: True

Description: Enable / disable the logs sharing.

Note

This index has no effect if logs is set to False.

show_execution_time

Type: boolean

Default value: False

Description: Enable / disable the output of the execution time.

show_percentage

Type: boolean

Default value: True

Description: Enable / disable the output of the percentage of each status.

simple

Type: boolean

Default value: False

Description: Enable / disable the simple output mode.

Note

If this index is set to True, the system only returns the result in the format: tested.element STATUS.

split

Type: boolean

Default value: True

Description: Enable / disable the split of the results files.

Note

“Results files” means the mirror of what is shown on screen.

store_whois_record

Type: boolean

Default value: False

Description: Enable / disable the storage of the WHOIS record into the WHOIS DB.

Warning

This does not disable the WHOIS DB functionality. It only controls whether the full WHOIS reply is stored in the database.

syntax

Type: boolean

Default value: False

Description: Enable / disable the syntax (only) testing.

Warning

If this index is set to True, we ONLY check for syntax, not for availability or reputation.

timeout

Type: integer

Default value: 5

Description: Set the timeout to apply every time it’s possible to set one.

ci

Type: boolean

Default value: False

Description: Enable / disable the CI autosaving system.

Warning

Do not activate this index unless you are using PyFunceble under a supported CI environment/platform.

ci_autosave_commit

Type: string

Default value: "PyFunceble - AutoSave"

Description: Set the default commit message we want to use when we have to commit (save) but our tests are not yet completed.

ci_autosave_final_commit

Type: string

Default value: "PyFunceble - Results"

Description: Set the default final commit message we want to use when all tests are finished.

ci_autosave_minutes

Type: integer

Default value: 15

Description: Set the minimum number of minutes we have to run before automatically saving our test results.

Note

As many services are setting a rate limit per IP, it’s a good idea to set this value between 1 and 15 minutes.
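The autosave threshold can be pictured as a simple elapsed-time check. The helper below is hypothetical and works on timestamps expressed in seconds, such as the values returned by time.time().

```python
def should_autosave(started_at, now, ci_autosave_minutes=15):
    """Return True once at least `ci_autosave_minutes` minutes have elapsed."""
    return now - started_at >= ci_autosave_minutes * 60


# 899 seconds after the start is still below the 15 minute threshold.
print(should_autosave(0, 899))
print(should_autosave(0, 900))
```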

ci_distribution_branch

Type: string

Default value: master

Description: Set the git branch where we are going to push our results.

Note

The difference between this and ci_branch is that this branch receives the results only once the tests have finished under the given ci_branch.

For example, this allows us to have 2 branches:

  • processing (ci branch), for the tests with PyFunceble.
  • master (ci distribution branch), for the distribution of the results of PyFunceble.

ci_branch

Type: string

Default value: master

Description: Set the git branch where we are going to push our results.

unified

Type: boolean

Default value: False

Description: Enable / Disable the generation of the unified results.

Note

This index has no effect if split is set to True.

use_reputation_data

Type: boolean

Default value: False

Description: Enable / Disable the usage of reputation data while testing the availability of a given subject.

Warning

This only has an effect when used along with the availability test.

verify_ssl_certificate

Type: boolean

Default value: False

Description: Enable / Disable the verification of the SSL/TLS certificate when testing URLs.

Warning

If you set this index to True, you may get false positive results.

Indeed, if the certificate is not registered with the CA or is simply invalid while the domain is still alive, you will always get INACTIVE as output.

whois_database

Type: boolean

Default value: True

Description: Enable / Disable the usage of the whois database to avoid/bypass whois server requests rate limit.

wildcard

Type: boolean

Default value: False

Description: Enable / Disable the test of wildcards when testing for syntax.

Warning

This is not taken into consideration if syntax is set to False.

user_agent

Type: dict

Description: Configures the user agent.

user_agent[browser]

Type: string

Default value: chrome

Description: Sets the browser to get the latest user agent from. Available values: chrome, edge, firefox, ie, opera, safari

Warning

This option is not taken into consideration if user_agent[custom] is not set to null.

user_agent[platform]

Type: string

Default value: linux

Description: Sets the platform to get the latest user agent for. Available values: linux, macosx, win10

Warning

This option is not taken into consideration if user_agent[custom] is not set to null.

user_agent[custom]

Type: string

Default value: null

Description: Sets the user agent to use.

Warning

Setting this index will overwrite the choices made in user_agent[platform] and user_agent[browser].
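The precedence between the three user_agent indexes can be sketched as follows. The function and the latest mapping (browser, then platform, then user agent string) are hypothetical; where the latest user agents really come from is not covered here.

```python
def select_user_agent(user_agent, latest):
    """Return the custom user agent when set, else the one for browser/platform.

    `latest` is a hypothetical nested mapping: latest[browser][platform].
    """
    custom = user_agent.get("custom")

    if custom is not None:
        # user_agent[custom] overrides user_agent[browser] and user_agent[platform].
        return custom

    return latest[user_agent["browser"]][user_agent["platform"]]
```

With the defaults (browser chrome, platform linux, custom null), the lookup branch is used; as soon as custom is set, the two other indexes are ignored.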

outputs

Type: dict

Description: Set the needed output tree/names.

Warning

If you choose to change anything please consider deleting our output/ directory and the dir_structure*.json files.

outputs[default_files]

Type: dict

Description: Set the default name of some important files.

outputs[default_files][dir_structure]

Type: string

Default value: dir_structure.json

Description: Set the default filename of the file which has the structure to re-construct.

Note

This index has no influence on dir_structure_production.json.

outputs[default_files][iana]

Type: string

Default value: iana-domains-db.json

Description: Set the default filename of the file which has the formatted copy of the IANA root zone database.

outputs[default_files][inactive_db]

Type: string

Default value: inactive_db.json

Description: Set the default filename of the file which will save the list of elements to retest over time.

outputs[default_files][results]

Type: string

Default value: results.txt

Description: Set the default filename of the file which will save the mirror of what is shown on screen.

outputs[default_files][public_suffix]

Type: string

Default value: public-suffix.json

Description: Set the default filename of the file which will save the formatted copy of the public suffix database.

outputs[default_files][mining]

Type: string

Default value: mining.json

Description: Set the default filename of the file which will save the temporary list of mined subjects to test.

outputs[default_files][whois_db]

Type: string

Default value: whois_db.json

Description: Set the default filename of the file which will save the whois information for caching.

outputs[domains]

Type: dict

Description: Set the default name of some important files related to the plain_list_domain index.

outputs[domains][directory]

Type: string

Default value: domains/

Description: Set the default directory where we have to save the plain list of elements for each status.

outputs[domains][filename]

Type: string

Default value: list

Description: Set the default filename of the file which will save the plain list of elements.

outputs[hosts]

Type: dict

Description: Set the default name of some important files related to the generate_hosts index.

outputs[hosts][directory]

Type: string

Default value: hosts/

Description: Set the default directory where we have to save the hosts files of the elements for each status.

outputs[hosts][filename]

Type: string

Default value: hosts

Description: Set the default filename of the file which will save the hosts files of the elements.

outputs[json]

Type: dict

Description: Set the default name of some important files related to the generate_json index.

outputs[json][directory]

Type: string

Default value: json

Description: Set the default directory where we have to save the JSON files of the elements for each status.

outputs[json][filename]

Type: string

Default value: dump.json

Description: Set the default filename of the file which will save the JSON files of the elements.

outputs[complements]

Type: dict

Description: Set the default name of some important files/directories related to the generate_complements index.

outputs[complements][directory]

Type: string

Default value: complements

Description: Set the default directory where we have to save the complements related files sorted by status.

outputs[analytic]

Type: dict

Description: Set the default name of some important files and directories related to the http_codes[active] index.

outputs[analytic][directories]

Type: dict

Description: Set the default name of some important directories related to the http_codes[active] index.

outputs[analytic][directories][parent]

Type: string

Default value: Analytic/

Description: Set the default directory where we are going to put everything related to the HTTP analytic.

outputs[analytic][directories][potentially_down]

Type: string

Default value: POTENTIALLY_INACTIVE/

Description: Set the default directory where we are going to put all potentially inactive data.

outputs[analytic][directories][potentially_up]

Type: string

Default value: POTENTIALLY_ACTIVE/

Description: Set the default directory where we are going to put all potentially active data.

outputs[analytic][directories][up]

Type: string

Default value: ACTIVE/

Description: Set the default directory where we are going to put all active data.

outputs[analytic][directories][suspicious]

Type: string

Default value: SUSPICIOUS/

Description: Set the default directory where we are going to put all suspicious data.

outputs[analytic][filenames]

Type: dict

Description: Set the default name of some important files related to the http_codes[active] index and the HTTP analytic subsystem.

outputs[analytic][filenames][potentially_down]

Type: string

Default value: down_or_potentially_down

Description: Set the default filename where we are going to put all potentially inactive data.

outputs[analytic][filenames][potentially_up]

Type: string

Default value: potentially_up

Description: Set the default filename where we are going to put all potentially active data.

outputs[analytic][filenames][up]

Type: string

Default value: active_and_merged_in_results

Description: Set the default filename where we are going to put all active data.

outputs[analytic][filenames][suspicious]

Type: string

Default value: suspicious_and_merged_in_results

Description: Set the default filename where we are going to put all suspicious data.

outputs[logs]

Type: dict

Description: Set the default name of some important files and directories related to the logs index.

outputs[logs][directories]

Type: dict

Description: Set the default name of some important directories related to the logs index.

outputs[logs][directories][date_format]

Type: string

Default value: date_format/

Description: Set the default directory where we are going to put everything related to the data when the dates are in the wrong format.

outputs[logs][directories][no_referer]

Type: string

Default value: no_referer/

Description: Set the default directory where we are going to put everything related to the data when no referer is found.

outputs[logs][directories][parent]

Type: string

Default value: logs/

Description: Set the default directory where we are going to put everything related to the logs.

outputs[logs][directories][percentage]

Type: string

Default value: percentage/

Description: Set the default directory where we are going to put everything related to percentages.

outputs[logs][directories][whois]

Type: string

Default value: whois/

Description: Set the default directory where we are going to put everything related to whois data.

Note

This is the location of all files when the debug index is set to True.

outputs[logs][filenames]

Type: dict

Description: Set the default filenames of some important files related to the logs index.

outputs[logs][filenames][auto_continue]

Type: string

Default value: continue.json

Description: Set the default filename where we are going to put the data related to the auto continue subsystem.

Note

This file is created only if auto_continue is set to True.

outputs[logs][filenames][execution_time]

Type: string

Default value: execution.log

Description: Set the default filename where we are going to put the data related to the execution time.

Note

This file is created only if show_execution_time is set to True.

outputs[logs][filenames][percentage]

Type: string

Default value: percentage.txt

Description: Set the default filename where we are going to put the data related to the percentage.

Note

This file is created only if show_percentage is set to True.

outputs[main]

Type: string

Default value: ""

Description: Set the default location where we have to generate the parent_directory directory and its dependencies.

outputs[parent_directory]

Type: string

Default value: output/

Description: Set the directory name of the parent directory which will contain all previously named directories.

outputs[splited]

Type: dict

Description: Set the default name of some important files and directory related to the split index.

outputs[splited][directory]

Type: string

Default value: splited/

Description: Set the default directory name where we are going to put the split data.

status

Type: dict

Description: Set the needed, accepted, and official status names.

status[list]

Type: dict

Description: Set the needed and accepted status names.

Warning

All statuses should be in lowercase.

status[list][valid]

Type: list

Default value: ["valid","syntax_valid","valid_syntax"]

Description: Set the accepted VALID status.

Note

This status is only shown if the syntax index is activated.

status[list][up]

Type: list

Default value: ["up","active"]

Description: Set the accepted ACTIVE status.

status[list][generic]

Type: list

Default value: ["generic"]

Description: Set the accepted generic status.

Note

This status is the one used to tell the system that we have to print the complete information on the screen.

status[list][http_active]

Type: list

Default value: ["http_active"]

Description: Set the accepted status for the outputs[analytic][filenames][up] index.

status[list][down]

Type: list

Default value: ["down","inactive", "error"]

Description: Set the accepted INACTIVE status.

status[list][invalid]

Type: list

Default value: ["ouch","invalid"]

Description: Set the accepted INVALID status.

status[list][potentially_down]

Type: list

Default value: ["potentially_down", "potentially_inactive"]

Description: Set the accepted status for the outputs[analytic][filenames][potentially_down] index.

status[list][potentially_up]

Type: list

Default value: ["potentially_up", "potentially_active"]

Description: Set the accepted status for the outputs[analytic][filenames][potentially_up] index.

status[list][suspicious]

Type: list

Default value: ["strange", "hum", "suspicious"]

Description: Set the accepted status for the outputs[analytic][filenames][suspicious] index.

status[official]

Type: dict

Description: Set the official status name.

Note

These statuses are the ones that are printed on the screen.

Warning

After any changes here please delete dir_structure.json and the output/ directory.

status[official][up]

Type: string

Default value: ACTIVE

Description: Set the returned status for the ACTIVE case.

status[official][down]

Type: string

Default value: INACTIVE

Description: Set the returned status for the INACTIVE case.

status[official][invalid]

Type: string

Default value: INVALID

Description: Set the returned status for the INVALID case.

status[official][valid]

Type: string

Default value: VALID

Description: Set the returned status for the VALID case.

Note

This status is only shown if the syntax index is activated.
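The relation between status[list] and status[official] can be sketched as a lookup from any accepted name to the official name that is printed. The dictionaries below copy the default values listed on this page for four of the statuses; the function name is hypothetical.

```python
# Accepted names (status[list]) and official names (status[official]).
STATUS_LIST = {
    "up": ["up", "active"],
    "down": ["down", "inactive", "error"],
    "invalid": ["ouch", "invalid"],
    "valid": ["valid", "syntax_valid", "valid_syntax"],
}
STATUS_OFFICIAL = {
    "up": "ACTIVE",
    "down": "INACTIVE",
    "invalid": "INVALID",
    "valid": "VALID",
}


def official_status(name):
    """Map an accepted status name to its official (printed) name."""
    lowered = name.lower()

    for key, accepted in STATUS_LIST.items():
        if lowered in accepted:
            return STATUS_OFFICIAL[key]

    raise ValueError(f"Unknown status: {name!r}")


print(official_status("error"))
print(official_status("syntax_valid"))
```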

http_codes

Type: dict

Description: Handle the interpretation of each status code when we generate our analytic data.

http_codes[active]

Type: boolean

Default value: True

Description: Enable / Disable the usage of the HTTP status code extraction.

http_codes[list]

Type: dict

Description: Categorize the http status code as mentioned in the documentation related to the HTTP Code column.

http_codes[list][up]

Type: list

Default value:
- 100
- 101
- 200
- 201
- 202
- 203
- 204
- 205
- 206

Description: List the HTTP status codes which are considered as ACTIVE.

http_codes[list][potentially_down]

Type: list

Default value:
- 400
- 402
- 403
- 404
- 409
- 410
- 412
- 414
- 415
- 416

Description: List the HTTP status codes which are considered as INACTIVE or POTENTIALLY_INACTIVE.

http_codes[list][potentially_up]

Type: list

Default value:
- 000
- 300
- 301
- 302
- 303
- 304
- 305
- 307
- 403
- 405
- 406
- 407
- 408
- 411
- 413
- 417
- 500
- 501
- 502
- 503
- 504
- 505

Description: List the HTTP status codes which are considered as ACTIVE or POTENTIALLY_ACTIVE.
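A categorization based on the three default lists above could look like the sketch below. The function name is hypothetical. Note that 403 appears in both the potentially_down and potentially_up defaults, so the order in which the lists are checked matters; the order used here is an assumption. The 000 entry is written as 0 because Python integers cannot carry leading zeros.

```python
UP = [100, 101, 200, 201, 202, 203, 204, 205, 206]
POTENTIALLY_DOWN = [400, 402, 403, 404, 409, 410, 412, 414, 415, 416]
POTENTIALLY_UP = [
    0, 300, 301, 302, 303, 304, 305, 307, 403, 405, 406, 407,
    408, 411, 413, 417, 500, 501, 502, 503, 504, 505,
]


def categorize(http_code):
    """Return the name of the list an HTTP status code belongs to, or None."""
    if http_code in UP:
        return "up"
    # Assumption: potentially_down is checked before potentially_up.
    if http_code in POTENTIALLY_DOWN:
        return "potentially_down"
    if http_code in POTENTIALLY_UP:
        return "potentially_up"

    return None


print(categorize(200))
print(categorize(404))
print(categorize(301))
```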