Find and Replace

On this page: Description, Step by Step, Supported Data Fields, Date Fields, Pattern Matching: Wildcard Characters, Pattern Matching: An Example, Variable Replacement Strings

Description

Find and Replace may be used to substitute one string for another in the fields listed below.

Step by Step

  1. Choose Find and Replace from the function tree.
  2. Choose the field you want to modify.
  3. Key the target string (or a pattern) in the Find what textbox.
  4. Key the replacement string in the Replace with textbox if you want to do a replacement.
  5. For some fields, you may restrict the set of records that are affected using the In: pull down list.
  6. Set the options, and click the [Find All] or [Replace All] button, as desired.

For an explanation of pattern matching, see Microsoft's documentation. It's a lot to read, but it contains far more details than the Pattern Matching: Wildcard Characters section on this page.

Supported Data Fields

For some fields, the program will search multiple columns in the database. In those cases, the program will include a new heading in the log when it begins processing a new column.

Date Fields

Before you modify dates with the Find and Replace feature, you should read and understand the following description of the process and the options. Always use the Log Only option to preview the results before you change your project data.

For the Name and Event date fields, the program supports three subsets of the data for each field. The field choices allow you to operate on all the date values, the regular date values only, or the irregular date values only.

Field Description
Name Date (All) All the date values for all Name tags
Name Date (Regular only) Regular dates only; irregular dates are ignored and not processed or changed
Name Date (Irregular only) Irregular dates only; regular dates are ignored and not processed or changed
Event Date (All) All the date values for all Event tags
Event Date (Regular only) Regular dates only; irregular dates are ignored and not processed or changed
Event Date (Irregular only) Irregular dates only; regular dates are ignored and not processed or changed

When you select any of the Date or Sort Date fields that include regular date values, Find and Replace converts TMG's internal date format to a formatted version of the date. In the formatted version, the modifier words "before", "say", "about", "circa" and "after" are spelled out, followed by the day number, a 3-character month abbreviation, and the 3 or 4 digit year (nn mmm yyyy). Construct your "Find what" value accordingly.

When you select any of the Date or Sort Date fields, the Convert to Regular Date checkbox is enabled. The default value is checked. When checked, and when the [Replace All] button has been pressed, Find and Replace will attempt to transform the new value for the field into a regular date. So, you can use the Convert to Regular Date option to transform irregular dates to regular dates.

If Find and Replace cannot convert a value to a regular date, it will store the value as an irregular date. This is true even when the original value of the date field was a regular date.

Find and Replace will not convert all versions of legitimate date values to regular dates. It recognizes the GEDCOM 5.5 standard format (nn mmm yyyy) as well as dates preceded by these modifiers: before, bef, after, aft, say, about, abt, circa, ca. It also recognizes partial dates like "mmm yyyy" and "yyyy".
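
The date shapes listed above can be approximated with a single pattern. The Python sketch below is illustrative only: it is not the program's actual parser, the names DATE_RE and looks_regular are hypothetical, and it assumes that a modifier may also precede a partial date. It may help you judge in advance whether a replacement value has a recognizable shape.

```python
import re

# Modifier words and date shapes are taken from the paragraph above.
# Everything else (names, whitespace handling) is an assumption.
MODIFIERS = r"(?:before|bef|after|aft|say|about|abt|circa|ca)"
MONTHS = r"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)"

# Optional modifier, optional "nn mmm" or "mmm", then a 3 or 4 digit year.
DATE_RE = re.compile(
    rf"^(?:{MODIFIERS}\s+)?(?:(?:\d{{1,2}}\s+)?{MONTHS}\s+)?\d{{3,4}}$",
    re.IGNORECASE,
)

def looks_regular(value: str) -> bool:
    """Return True when the value has one of the shapes described above."""
    return DATE_RE.match(value.strip()) is not None

for candidate in ["12 Mar 1850", "abt 1850", "Mar 1850",
                  "March 12, 1850", "between 1850 and 1860"]:
    print(candidate, "->", looks_regular(candidate))
```

The first three candidates have a recognized shape; the last two do not, so under these assumptions they would be the ones to check carefully with the Log Only option.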

When Find and Replace successfully converts the new value for a date field to a regular date, it adds the notation "(R)" to the entry in the log file. If the result of a modification is an irregular date, the log entry is followed by "(I)".

If you modify a regular date, and you do not select the Convert to Regular Date option, you will transform regular dates to irregular dates. That is usually a bad outcome and should be avoided.

Find and Replace will not modify regular dates that contain date ranges. Date range dates are "from-to" dates, "between" dates, and "date or date" dates.

When you select any of the Date fields, as opposed to Sort Date fields, the Copy to Sort Date checkbox is enabled. The default value is checked. When checked, Find and Replace will copy the result of the replace operation to the tag's Sort Date field as well as the tag's Date field.

Pattern Matching: Wildcard Characters

When the Use Pattern Matching option is checked, you may use a "regular expression" as the target of the search. Regular expressions are a powerful matching facility that allow you to search for patterns of text rather than for literal text values. If you are not familiar with regular expressions, you may want to consider them as a means to add wildcard characters to the target.

Alphabetic and numeric characters, ABC, 123, etc., can be used in regular expressions. Those characters are literals and match the given character when it appears in the text being searched. The power of regular expressions becomes evident when you mix literal characters and one or more of the symbols described in the table below. Some of the symbols represent characters that do not appear on the keyboard.

Symbol Use
\n Matches a new line
\f Matches a form feed
\r Matches carriage return
\t Matches horizontal tab
. Matches any character except a newline character (\n). "ca." matches "cat" and "car", etc.
.* Matches zero or more characters in sequence except a newline character (\n). "p.*ent" matches "parent", "patent", and "potent" as well as "part of the tent", etc.
^ Matches the beginning of the field being searched. "^Brian " would only match the word "Brian" when it appears as the first word in a field.
$ Matches the end of the field being searched. " Brian$" would only match the word "Brian" when it appears as the last word in a field.
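
The wildcard symbols in the table can be tried outside the program. The sketch below uses Python's re module, which treats these particular symbols the same way for single-line values; it is a stand-in for experimenting, not the Microsoft facility the program uses, and the helper name found is hypothetical.

```python
import re

def found(pattern: str, field: str) -> bool:
    """Return True when the pattern matches anywhere in the field."""
    return re.search(pattern, field) is not None

print(found(r"ca.", "car"))                   # True: "." stands for any one character
print(found(r"p.*ent", "part of the tent"))   # True: ".*" spans a run of characters
print(found(r"^Brian ", "Brian Smith"))       # True: "Brian" starts the field
print(found(r"^Brian ", "Mary Brian Smith"))  # False: "Brian" is not at the start
print(found(r" Brian$", "Mary Brian"))        # True: "Brian" ends the field
```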

There are more symbols than those shown above. They are explained in Microsoft's documentation. You don't have to know how to use all the symbols, but if you are using patterns, you have to know how to avoid using a character as a symbol when you want to match the literal character. See the table below for characters that you have to precede with "\" to match their literal value.

Symbol Matches
\? ?
\* *
\+ +
\. .
\| |
\{ {
\} }
\\ \
\[ [
\] ]
\( (
\) )
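
The effect of the backslash is easiest to see by comparing an escaped and an unescaped pattern. This is a small Python sketch, assuming Python's re module as a stand-in for the program's pattern matcher; found is a hypothetical helper, and re.escape is a Python convenience that has no equivalent in the "Find what" textbox.

```python
import re

def found(pattern: str, field: str) -> bool:
    """Return True when the pattern matches anywhere in the field."""
    return re.search(pattern, field) is not None

print(found(r"1.5", "125"))    # True: unescaped "." matches any character
print(found(r"1\.5", "125"))   # False: "\." matches only a literal period
print(found(r"1\.5", "1.5"))   # True

# re.escape adds the backslashes for you when the whole target is literal.
literal = "Smith (?)"
print(found(re.escape(literal), "John Smith (?)"))  # True
```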

You may use the following symbols in the "Replace with" textbox.

Symbol Use
\n Inserts a new line character
\r Inserts a carriage return character
\t Inserts a tab character
$1, $2, etc. These symbols are explained below in the Variable Replacement Strings section. Note that "$" has an entirely different meaning when used in the "Find what" textbox.

Pattern Matching: An Example

Pattern matching using regular expressions is a powerful capability. That power comes at a price, however: complexity. Still, it is possible to make effective use of pattern matching with only a subset of the language.

Please note that the following example is based on changing the contents of the surname field. The Find and Replace feature does not support that field as of version 5.0. The regular expression rules have not changed.

My ancestors include a family with the surname Millet, variously spelled Millet, Millett, Millit, Millitt, Milit, etc. It is relatively easy to write a regular expression to find all the variations: "e" versus "i", single "L" versus double "L", single "T" versus double "T". Here's how.

Let's start with a simple expression:
Mill[ei]tt

The expression consists mostly of literal characters that must match the string being searched exactly. The [ei] is a "character class" expression. Character-class expressions consist of a sequence of characters in brackets. Unlike literal matches, the program does not attempt to find all of the characters in the sequence. Instead, it tries to find one of the characters.

Unfortunately, the expression Mill[ei]tt will only find two variations of the name: Millett and Millitt.

Let's add a repetition expression to control how many Ls and how many Ts will be considered a match. The repetition expression takes the form {minimum,maximum}.

The new expression is now:
Mil{1,2}[ei]t{1,2}

The expression will now find all the combinations of one and two Ls, one and two Ts, and either an I or an E as the last vowel. It will now match all the examples given above, and then some.
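
To check an expression like this before running it against your project, you can test it on a list of sample spellings. The sketch below uses Python's re module as a stand-in for the program's matcher. It uses fullmatch so that each name must match from start to end, which is an assumption made for clarity; the program itself searches within the field.

```python
import re

pattern = re.compile(r"Mil{1,2}[ei]t{1,2}")

# Spellings from the text, one extra variant, and a name that should not match.
names = ["Millet", "Millett", "Millit", "Millitt", "Milit", "Milett", "Miller"]

for name in names:
    print(name, pattern.fullmatch(name) is not None)
```

Every spelling prints True except Miller, which has no T after the vowel. Milett is one of the "and then some" spellings the expression also accepts.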

Here's another example:
Faire{0,1}banke{0,1}s{0,1}

This will find many of the variations of the name Fairbanks, including Fairbank, Fairbanks, Fairebank, Fairebanke, Fairbanke, Fairebankes, etc. The two Es and the trailing S may appear zero or one times.
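
In most regular expression dialects, a question mark is shorthand for {0,1}. The Python sketch below checks that the long and short forms agree on the spellings above. It assumes Python's re module as a stand-in, so confirm in Microsoft's documentation that the shorthand is available before relying on it in the program.

```python
import re

long_form = re.compile(r"Faire{0,1}banke{0,1}s{0,1}")
short_form = re.compile(r"Faire?banke?s?")  # "?" means zero or one

names = ["Fairbank", "Fairbanks", "Fairebank", "Fairebanke",
         "Fairbanke", "Fairebankes", "Fairbanx"]

for name in names:
    a = long_form.fullmatch(name) is not None
    b = short_form.fullmatch(name) is not None
    print(name, a, b)
```

Both forms accept every spelling except Fairbanx.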

These expressions may look confusing, but they only use three concepts:

  1. literal matches
  2. character-class matches
  3. repetition controls

Other expressions are available, but with these three you can do a lot.

If you need to match one of the special characters like [ or {, use a backslash (\) as an escape character. The following expression will find the name smith in brackets:
\[smith\]

Good luck. Read Microsoft's documentation for more details.

Variable Replacement Strings

Sometimes you want to change characters that occur only within some other variable characters. For example, let's assume that you want to change "number & number" to "number, number" where "number" is any number. This is a real example from some citation detail strings that I wanted to change. Some of my citations were like this:

16, 18
17, 24

But some were like this:

59 & 77
111 & 143

I wanted them to be consistent, and I wanted to use the first format.

It's pretty simple to write a search expression to find two numbers separated by space-ampersand-space:
[0-9]+ & [0-9]+

The [0-9] part means "a digit", and the plus sign that follows that means "one or more of those characters." Together, [0-9]+ means "one or more digit characters."

OK, I can find number-ampersand-number in the citation details field. But to change it, I have to retain the number parts and discard the ampersand part. How do I do that?

It turns out that the "Replace with" string can contain a special construct to refer to elements of the string being changed. You add some special characters to the search string to split the target text into parts. Basically, you add parentheses around the parts of the expression that represent characters you want to copy to the replacement string. In our case, we need them around the two numbers:
([0-9]+) & ([0-9]+)

That's only half the battle. Now we need to specify the replacement string. To refer to the two numbers, we use $1 and $2. For each part you designate with ( and ), you can reference the actual value in the replacement string using $ and the sequence number of the ( within the target.

So, our replacement string is:
$1, $2

You can use this facility to re-order character strings. In the example, if our replacement string was
$2, $1

... the numbers would be reversed.
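
The same replacement can be rehearsed outside the program. The sketch below uses Python's re module, with one difference to keep in mind: Python writes the references as \1 and \2 where the "Replace with" textbox uses $1 and $2. The function name replace_all is hypothetical.

```python
import re

FIND_WHAT = r"([0-9]+) & ([0-9]+)"

def replace_all(text: str, replace_with: str) -> str:
    """Replace every match of FIND_WHAT in text with replace_with."""
    return re.sub(FIND_WHAT, replace_with, text)

print(replace_all("59 & 77", r"\1, \2"))    # 59, 77
print(replace_all("111 & 143", r"\2, \1"))  # 143, 111 (order reversed)
print(replace_all("16, 18", r"\1, \2"))     # 16, 18 (no ampersand, unchanged)
```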

Notes

  1. The Find and Replace feature relies on a regular expression facility provided by Microsoft. It appears that there may be two different versions in circulation: one that uses parentheses to identify parts of the expression, and one that uses \( and \). If the method described above doesn't work, try using backslashes before the opening and closing parentheses.