Password Definition Language

Password definition language (PDL) is designed for flexible and powerful control of password generation. It lets you build a password definition from everything that is known about a password, which substantially reduces the time of password searching. A typical example of such information is: "I remember that my password consists of two words separated by one of the signs '-', '_', '=', ',' and the first letters of the words are in upper case" (see example 1 to see how it looks in PDL).

The main idea of PDL is a component-based description of the password. The above example has three components: the first word, a separator (symbol) and the second word. In most cases it is possible to divide the verbal description of a password into components.

The syntax of PDL language is close to the regular expressions syntax. Your knowledge of regular expressions will help you understand PDL language easily.

There are three main component types in PDL: charsets, words and generators.

A password definition file (PDF-file) is just a text file in UTF-16 or UTF-8 format. It consists of two parts: first, the dictionary and character set definitions, and second, the password definitions.

4.2. Passwords definition

This is the main part of the file. It must be present in every password definition file (password.def), after the line '##', and it sets the password generation rules to be checked later on. It consists of text lines; each line defines its own password set and mode of operation, i.e. an algorithm of password search. Each line is independent and is processed separately, while the total number of checked passwords is counted across all lines.

4.2.1. Character sets

A character set (charset) is a range of characters, any of which can occupy the current position in a password. The supported charsets are:

1) Simple single characters (a, b, etc.). It means this particular character occupies given position in a password;

2) Shielded characters. If any of the special characters occurs in the password, it must be shielded (escaped) with '\'. The meaning is the same as in item 1 above. The special characters are:

\$, \., \*, \?, \=          '$', '.', '*', '?', '='
\], \[, \{, \}, \(, \)      corresponding brackets
\ (space character)         space character
\XXXX                       any Unicode character in hex (X is a hexadecimal digit), like \D9E9
\0                          no character. It is usually used in conjunction with a "real" character (please find examples below).

Generally, any character can be shielded except hexadecimal digits.


3) Character set macros. The current position in the password can be occupied by any character from the set. These sets are specified in the first part of the password definition file (see section 4.3.2) and are denoted as:

$a      lower-case Latin letters (26 letters, unless otherwise specified)
$A      upper-case Latin letters (26 letters, unless otherwise specified)
$!      special characters (32 characters, unless otherwise specified)
$1      digits (10 digits, unless otherwise specified)
$i      lower-case letters of national alphabet
$I      upper-case letters of national alphabet
$o      other user-specified characters
?       any character (i.e. all the characters included in the macros mentioned above)

NOTE: macros $v and $p (see section 4.3.4) cannot be used for password definition.

4) Any combination of the characters mentioned above, written in square brackets. The meaning is the same as in item 3 above. For example:

[$a $A]                     any Latin letter
[abc]                       a, or b, or c
[$1 abcdef]                 hexadecimal digit
[s \0]                      s or nothing
[$a $A $1 $! $i $I $o]      this is equivalent to ?
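
The expansion of a bracketed set into its alternatives can be sketched in Python. This is only an illustration, not the PDL engine itself: the function name expand_charset and the MACROS table are hypothetical, the macro contents are assumed defaults (the text above gives only their sizes), unshielded spaces are assumed to be separators, and the \XXXX form and the national macros are not handled.

```python
import string

# Assumed default contents of the standard macros (the manual states only their sizes).
MACROS = {
    "a": string.ascii_lowercase,   # $a: 26 lower-case Latin letters
    "A": string.ascii_uppercase,   # $A: 26 upper-case Latin letters
    "1": string.digits,            # $1: 10 digits
    "!": string.punctuation,       # $!: 32 special characters (assumed: ASCII punctuation)
}

def expand_charset(spec):
    """Expand a set such as '[$1 abcdef]' into a list of alternatives.

    The empty string stands for the 'no character' item.
    """
    body = spec.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "$":                  # macro such as $a
            out.extend(MACROS[body[i + 1]])
            i += 2
        elif ch == "\\":               # shielded character
            nxt = body[i + 1]
            out.append("" if nxt == "0" else nxt)
            i += 2
        elif ch == " ":                # assumed: an unshielded space only separates items
            i += 1
        else:                          # plain single character
            out.append(ch)
            i += 1
    return list(dict.fromkeys(out))    # drop repeated alternatives, keep the order

print("".join(expand_charset("[$1 abcdef]")))   # 0123456789abcdef
```

With these assumed defaults, [$a $A] expands to 52 alternatives and [s \0] to the two alternatives "s" and "nothing".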

5) Regular repetition character *. It means the previous character set is repeated at this place in the password 0 or more times. For example:

$a *                        a password of any length, consisting of lower-case Latin letters
[ab] * c                    c, ac, bc, aac, abc, bac, bbc, aaac, ...
[$a $A] [$a $A $1] *        "identifier", i.e. a sequence of letters and digits with a letter at first position
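
The order in which the second example is enumerated can be reproduced with a short Python sketch. The function expand_star and its max_repeat bound are hypothetical; how the real engine limits the length of the repetition is not shown here.

```python
from itertools import product

def expand_star(repeated, suffix, max_repeat):
    """Yield passwords for '<repeated> * <suffix>' with 0..max_repeat repetitions."""
    for n in range(max_repeat + 1):
        # product(..., repeat=0) yields one empty tuple, i.e. zero repetitions
        for combo in product(repeated, repeat=n):
            yield "".join(combo) + suffix

print(list(expand_star("ab", "c", 2)))
# ['c', 'ac', 'bc', 'aac', 'abc', 'bac', 'bbc']
```

Shorter passwords come first, so the number of candidates grows by a factor of the set size with every extra repetition.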

Note that password of zero length (null password) is physically meaningful and is not always the same as no password at all.

The length of repetition is computed in two ways:

It is recommended to use * as widely as possible, because it allows the most powerful search. Although the constructions ? * and ? ? * seem alike from a logical standpoint, the first one will be searched faster.

Current limitation: * can be used only once in each line.

4.2.2. Dictionary words and their modifiers

In contrast to a character set, a word represents several consecutive password characters. Two dictionaries are supported by the PDL library: main (mostly ordinary words) and user’s (where special information can be stored, for example proper names, dates, etc.), though there is no difference between them.

A dictionary is a text file in UTF-16 or UTF-8 encoding (Windows), consisting of words separated by end-of-line characters. Both DOS-format (CR/LF) and UNIX-format (LF) files can be used. It is preferable to use words of the same (lower) case in dictionaries (to increase the search rate, among other factors). Sorting the words by length is also desirable.

Thus, there are two macros:

$w      a word from the main dictionary
$u      a word from the user dictionary

It is well known that altered words are often used as passwords, so a whole set of word modifiers is introduced to define such passwords. These are:

.u (upper)          to upper-case
.l (lower)          to lower-case
.t (truncate)       to truncate up to the given length
.c (convert)        to convert the word
.j (joke)           to upper-case some letters
.r (reverse)        to reverse the word
.s (shrink)         to shrink the word
.p (duplicate)      to duplicate the word

Modifiers may have parameters, written in round brackets. For modifiers that can act on single letters, the number of the letter can be given as a parameter. A missing or null parameter means "the whole word". Letters can be numbered from either the beginning or the end of the word. The end of the word is denoted by the character '-'.

Currently there are only three such modifiers: .u, .l, .t. So, use:

.u or .u(0)         to upper-case the whole word (PASSWORD)
.u(1), .u(2)        to upper-case only the first (the second) letter (Password, pAssword)
.u(-), .u(-1)       to upper-case the last (the next to last) letter (passworD, passwoRd)
.t(-1)              to truncate the last letter in the word (passwor)
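
The letter numbering rules can be sketched in Python as follows. The helper names are hypothetical, and only the parameter forms documented in this section are covered; the behaviour for out-of-range positions is an assumption.

```python
def resolve_position(word, param):
    """Map a PDL letter number ('1', '2', '-', '-1', ...) to a 0-based index."""
    if param.startswith("-"):
        back = int(param[1:] or "0")   # '-' is the last letter, '-1' the one before it
        return len(word) - 1 - back
    return int(param) - 1              # '1' is the first letter

def change_case(word, param="", to_upper=True):
    """Sketch of .u (to_upper=True) and .l (to_upper=False)."""
    convert = str.upper if to_upper else str.lower
    if param in ("", "0"):             # no parameter or null parameter: the whole word
        return convert(word)
    i = resolve_position(word, param)
    if not 0 <= i < len(word):
        return word                    # assumed: out-of-range positions change nothing
    return word[:i] + convert(word[i]) + word[i + 1:]

def truncate(word, param):
    """Sketch of .t: .t(4) keeps the first 4 letters, .t(-1) drops the last letter."""
    if param in ("", "0"):
        return word
    return word[:int(param)]

print(change_case("password", "-1"))   # passwoRd
print(truncate("password", "-1"))      # passwor
```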

The other modifiers operate on whole words only, and their parameters select the way of modification. The following modifier parameters are currently defined:

.j(0) or .j         to upper-case odd letters (PaSsWoRd)
.j(1)               to upper-case even letters (pAsSwOrD)
.j(2)               to upper-case vowels (pAsswOrd)
.j(3)               to upper-case consonants (PaSSWoRD)
.r(0) or .r         to reverse the word (drowssap)
.s(0) or .s         to reduce the word by discarding vowels, except that a vowel in the first position is kept (password -> psswrd, offset -> offst)
.p(0) or .p         to duplicate the word (passwordpassword)
.p(1)               to add the reversed word (passworddrowssap)
.c(<number>)        to convert all the letters in the word according to the appropriate conversion string (see section 4.3.3)
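
A minimal Python sketch of the .j, .r, .s and .p modifiers, checked against the examples above. The function names are hypothetical, and the vowel set is an assumption (it is not listed in this section); .c is omitted because it depends on the conversion strings of section 4.3.3.

```python
VOWELS = set("aeiou")   # assumption: the exact vowel set is not given in this section

def joke(word, mode=0):
    """.j: upper-case odd (0) or even (1) letters, vowels (2) or consonants (3)."""
    out = []
    for pos, ch in enumerate(word, start=1):   # letters are numbered from 1
        is_vowel = ch.lower() in VOWELS
        hit = (
            (mode == 0 and pos % 2 == 1)
            or (mode == 1 and pos % 2 == 0)
            or (mode == 2 and is_vowel)
            or (mode == 3 and ch.isalpha() and not is_vowel)
        )
        out.append(ch.upper() if hit else ch)
    return "".join(out)

def reverse(word):
    """.r: reverse the word."""
    return word[::-1]

def shrink(word):
    """.s: drop vowels, but keep the first letter even when it is a vowel."""
    return word[:1] + "".join(c for c in word[1:] if c.lower() not in VOWELS)

def duplicate(word, mode=0):
    """.p: append the word itself (0) or the reversed word (1)."""
    return word + (word if mode == 0 else word[::-1])

print(joke("password", 3))   # PaSSWoRD
print(shrink("offset"))      # offst
```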

All the modifiers operate correctly with both Latin and national letters, provided that the rules of national character set definition are observed. There can be more than one modifier (the number of consecutive modifiers is limited to 63, which is unlikely to be exceeded). For example (let $w mean "password"):

$w.u(1).u(-)        PassworD
$w.s.t(4)           pssw
$w.t(4).s           pss
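
The last two examples differ only in the order of the modifiers, which suggests that a chain is applied strictly from left to right. A small Python sketch of that reading follows. The names apply_chain and MODIFIERS are hypothetical, only .u, .t and .s are implemented, and error handling is left out.

```python
import re

def upper_at(word, param):
    """.u: whole word for '' or '0', otherwise a single numbered letter."""
    if param in ("", "0"):
        return word.upper()
    if param.startswith("-"):              # '-' is the last letter, '-1' the next to last
        i = len(word) - 1 - int(param[1:] or "0")
    else:                                  # '1' is the first letter
        i = int(param) - 1
    return word[:i] + word[i].upper() + word[i + 1:]

def truncate(word, param):
    """.t: .t(4) keeps 4 letters, .t(-1) drops the last one."""
    return word if param in ("", "0") else word[:int(param)]

def shrink(word, param):
    """.s: drop vowels (assumed: a, e, i, o, u) except in the first position."""
    return word[:1] + "".join(c for c in word[1:] if c not in "aeiou")

MODIFIERS = {"u": upper_at, "t": truncate, "s": shrink}

def apply_chain(word, chain):
    """Apply modifiers such as '.s.t(4)' to word, left to right."""
    for name, param in re.findall(r"\.([a-z])(?:\(([^)]*)\))?", chain):
        word = MODIFIERS[name](word, param)
    return word

print(apply_chain("password", ".s.t(4)"))   # pssw
print(apply_chain("password", ".t(4).s"))   # pss
```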

4.2.3. Generators

A generator is the last possible component type. It generates several passwords of different lengths from the given symbols in the current position. Generators are denoted by the symbols { and }, followed by the generator type, like {abc}.u. The opening bracket marks the position where the generator begins, and the closing one marks where it ends.

1) The default generator - permutation brackets { }

This generator has no explicit type, i.e. it is simply denoted by { and }.

A common problem is that you remember your password, but it is not accepted for some reason. Probably, you mistyped it. The PDL engine has its own algorithm to restore such passwords. The following typing mistakes are considered: two neighboring letters are swapped (psasword), a letter is omitted (pasword), a needless letter is inserted (passweord) or one letter is replaced with another (passwird). Such password changes will be referred to as permutations.

To indicate the beginning and the end of the password section where permutations could appear, permutation brackets '{' and '}' are used. The bracket '}' can be followed by a number of permutations (1 by default), separated by a dot. The physical meaning of the number of permutations is the number of simultaneously introduced mistakes. For example:

{abc} - 182 (different) passwords will be obtained, including:

bac, acb            2 swaps
bc, ac, ab          3 omissions
aabc, babc ...      4 * 26 - 3 insertions
bbc, cbc ...        3 * 25 replacements
abc                 the desired word
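
The count of 182 can be checked with a short Python sketch that enumerates the four kinds of mistake and removes replicas with a set. This illustrates the counting only and is not the engine's own algorithm; the function name one_mistake is hypothetical, and the inserted and replacing characters are taken from the lower-case Latin letters ($a).

```python
import string

def one_mistake(word, alphabet=string.ascii_lowercase):
    """Return the word itself plus every variant with exactly one typing mistake."""
    variants = {word}
    for i in range(len(word) + 1):
        head, tail = word[:i], word[i:]
        if len(tail) > 1:
            variants.add(head + tail[1] + tail[0] + tail[2:])   # swap two neighbours
        if tail:
            variants.add(head + tail[1:])                       # omit a letter
            for ch in alphabet:
                variants.add(head + ch + tail[1:])              # replace a letter
        for ch in alphabet:
            variants.add(head + ch + tail)                      # insert a letter
    return variants

print(len(one_mistake("abc")))   # 182
```

The total is 2 swaps + 3 omissions + (4 * 26 - 3) insertions + 3 * 25 replacements + the word itself = 182. The three insertions subtracted are aabc, abbc and abcc, each of which arises from two different insert positions.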

{password}.2 - the following words will be generated: psswrod, passwdro, paasswor, etc.;

{$w} - all the words, containing one mistake, from the main dictionary.

Notes:

a) Obviously, some passwords will be obtained more than once, so the larger the number of permutations, the larger the number of replicas. Efforts were made in this program to reduce replicas, but they are purely empirical and were made for two permutations at most. In other words, for larger numbers there is no certainty that a particular password will not be discarded erroneously.

b) For insertion and replacement one should know the set of characters to be inserted or replaced. If this set is not specified explicitly (see section 4.3.4), the program forms it automatically for character sets, based on the standard sets these characters belong to (i.e. for {password} $a will be inserted, for {Password} [$a $A] will be inserted). A similar operation is performed for words, based on the first word from the dictionary with modifiers taken into account. If the set is specified explicitly, exactly that set is used.

2) Upper and lowercase generators {}.u, {}.l

These generators will generate all possible upper or lower-case combinations from the given symbols, like:

{abc}.u     abc, Abc, aBc, abC, ABc, AbC, aBC, ABC
{AbC}.l     AbC, abC, Abc, abc
{$w}.u      All main dictionary words with all possible upper-case conversions
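
The two examples can be reproduced with a Python sketch in which every letter that is not yet in the target case offers two choices. The function name case_variants is hypothetical, and the order of the generated passwords is not meant to match the engine's order.

```python
from itertools import product

def case_variants(word, to_upper=True):
    """Sketch of {word}.u (to_upper=True) and {word}.l (to_upper=False)."""
    choices = []
    for ch in word:
        changed = ch.upper() if to_upper else ch.lower()
        # a letter already in the target case (or a non-letter) offers one choice only
        choices.append((ch,) if changed == ch else (ch, changed))
    return ["".join(p) for p in product(*choices)]

print(len(case_variants("abc")))            # 8
print(sorted(case_variants("AbC", False)))  # ['AbC', 'Abc', 'abC', 'abc']
```

A word with n letters in the "wrong" case therefore yields 2 to the power n passwords, which is why {AbC}.l produces only four.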

3) Conversion generator {}.c

Like the generators above, this one generates all possible combinations of a word, using the appropriate conversion table (see section 4.3.3) given in round brackets.

Suppose the .c(0) conversion table defines conversion of letters to similar symbols; then

{password}.c(0)     password, pa$sword, pas$word, passw0rd, pa$$word, pa$sw0rd, pas$w0rd, pa$$w0rd
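
A Python sketch of the same idea follows. The table used here (s to $, o to 0) is only inferred from the passwords listed above; the real table comes from the conversion string described in section 4.3.3, and the function name convert_variants is hypothetical.

```python
from itertools import product

# Hypothetical conversion table, inferred from the example output above.
TABLE = {"s": "$", "o": "0"}

def convert_variants(word, table):
    """Each convertible letter is either kept or converted, in every combination."""
    choices = [(ch, table[ch]) if ch in table else (ch,) for ch in word]
    return ["".join(p) for p in product(*choices)]

print(len(convert_variants("password", TABLE)))   # 8
```

"password" contains three convertible letters (s, s, o), hence 2 * 2 * 2 = 8 passwords.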

Of course, there may be several generators in the password description. So, the definition like

{my{good}password}.u

may occur.