The Noobs of Python: Ep.2.3 - String Basics

Strings can be used to represent just about anything that can be encoded as text or bytes: the contents of text files loaded into memory, Internet addresses, Python source code, and so on. Strings can also hold the raw bytes used for media files and network transfers, and both the encoded and decoded forms of non-ASCII Unicode text used in internationalized programs. Python strings are categorized as immutable sequences, meaning that the characters they contain have a left-to-right positional order and that they cannot be changed in place.
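
To make "immutable sequence" a bit more concrete, here is a minimal sketch. The variable names and the value 'Level1Tech' are just picked for illustration, and the slice forum[1:] borrows from the slicing Ep:

```python
# Strings are sequences: every character has a position, starting at 0.
forum = 'Level1Tech'   # example value chosen for this sketch
first = forum[0]
print(first)           # L

# Strings are immutable: assigning to a position raises TypeError.
try:
    forum[0] = 'l'
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

# To "change" a string you build a new one and leave the old one untouched.
lowered = 'l' + forum[1:]
print(lowered)         # level1Tech
print(forum)           # Level1Tech
```

The original string is never modified; the name lowered simply points at a new string object.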

Here are some examples of strings in Python:
Empty string                        L1T = ''
Double quotes, same as single       L1T = "spam's"
Escape sequences                    L1T = 'Level\n1\tTech'
Triple-quoted block strings         L1T = """...multiline..."""
Raw strings (no escapes)            L1T = r'\usr\bin\Python-3.5'
Byte strings in 2.6, 2.7, and 3.X   L1T = b'sp\xc4m'
Concatenate, repeat                 L1 + L2, L1 * 3

There are many more ways to code strings for processing. Strings also support expression operations such as slicing (extracting sections) and indexing (fetching by offset), which we will eventually cover in other Eps. Beyond the core set of string tools above, Python also supports more advanced pattern-based string processing with the standard library’s re (for “regular expression”) module, and even higher-level text processing tools such as XML parsers, which I hope to discuss in later Eps. This Ep is about fundamentals.

Q: So I see the examples you posted above but I'm a Noob so can you explain?
A: Here we go !

String literals can be written enclosed in either two single or two double quotes. The two forms work the same and return the same type of object. The reason for supporting both is that it allows you to embed a quote character of the other variety inside a string without escaping it with a backslash. You can use IDLE 3 to practice:

>>> 'Level1Tech', "Level1Tech"
('Level1Tech', 'Level1Tech')

We will generally use single quotes around strings just because they are marginally easier to read, except in cases where a single quote is embedded in the string. This is a purely subjective style choice, but Python displays strings this way too and most Python programmers do the same today, so you probably should too. Here is an example that joins three single-quoted strings with the + operator:

>>> forum = 'Level' + '1' + 'Tech'
>>> forum
'Level1Tech'
>>>

Even without the + operator between them, Python joins adjacent string literals automatically; adding the + just invokes concatenation explicitly. Adding commas between these strings would result in a tuple, not a string. Also notice in all of these outputs that Python prints strings in single quotes unless they embed one. If needed, you can also embed quote characters by escaping them with backslashes, as in the example below:

>>> 'The Knight\'s of Python'
"The Knight's of Python"
>>>
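
If you want to check the points about adjacent literals, commas and embedded quotes for yourself, a small sketch like this should do it (the variable names are made up for the example):

```python
# Adjacent string literals are joined automatically, no + needed.
implicit = 'Level' '1' 'Tech'
explicit = 'Level' + '1' + 'Tech'
print(implicit == explicit)                     # True

# Commas build a tuple of three strings instead of one string.
as_tuple = 'Level', '1', 'Tech'
print(type(as_tuple).__name__, len(as_tuple))   # tuple 3

# Two ways to embed a single quote: switch the outer quotes, or escape it.
with_double = "The Knight's of Python"
with_escape = 'The Knight\'s of Python'
print(with_double == with_escape)               # True
```

Both spellings of the last string produce exactly the same object contents; the backslash never ends up in the string.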

Which leads us into escape sequences. Backslashes are used to introduce special character codings known as escape sequences, which let us embed characters in strings that cannot easily be typed on a keyboard. The character \ and one or more characters following it in the string literal are replaced with a single character in the resulting string object, which has the binary value specified by the escape sequence. See the example below:

>>> L1t = 'Level\n1\tTech'
>>> print(L1t)
Level
1	Tech
>>>

The two characters \n stand for a single character: the binary value of the newline character in your character set. Similarly, the sequence \t is replaced with the tab character. To be completely sure how many actual characters are in this string, use the built-in len function. It returns the actual number of characters in a string, regardless of how it is coded or displayed.

>>> len(L1t)
12
>>>

Python recognizes a full set of escape code sequences as shown below:

Escape     Meaning
\newline      Ignored (continuation line)
\\            Backslash (stores one \ )
\'            Single quote (stores ' )
\"            Double quote (stores " )
\a            Bell
\b            Backspace
\f            Formfeed
\n            Newline (linefeed)
\r            Carriage return
\t            Horizontal tab
\v            Vertical tab
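
A quick way to convince yourself that each escape in the table really is stored as one character is to measure it. This is only a sketch; the dictionary and its labels are my own naming, not anything built into Python:

```python
# Each escape sequence from the table, stored under a descriptive label.
escapes = {
    'backslash': '\\',
    'single quote': '\'',
    'double quote': '\"',
    'bell': '\a',
    'backspace': '\b',
    'formfeed': '\f',
    'newline': '\n',
    'carriage return': '\r',
    'horizontal tab': '\t',
    'vertical tab': '\v',
}

for label, char in escapes.items():
    # len() is always 1; ord() is the numeric code; repr() is how Python shows it.
    print(label, len(char), ord(char), repr(char))

# A backslash followed by a real line break continues the literal
# and stores nothing at all.
joined = 'Level1\
Tech'
print(joined)   # Level1Tech
```

Try adding your own strings to the dictionary and see what len and repr report for them.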

Q: Very cool ! I should take a moment to practice these...
A: The next Ep will be a broader explanation of strings, so practice these to understand them better.

Code On Code_Warriors !


Raw Strings Suppress Escapes

Escape sequences are handy for embedding special character codes within strings. Sometimes, the special treatment of backslashes for introducing escapes can lead to trouble. For instance :

>>> L1T = open('C:\new\text.dll', 'w')
>>>

You might think this opens a file called text.dll in the directory C:\new, but there is a problem: \n is taken to stand for a newline character, and \t is replaced with a tab. So the call tries to open a file named C:(newline)ew(tab)ext.dll. This is just the sort of thing that raw strings are useful for. If the letter r (uppercase or lowercase) appears just before the opening quote of a string, it turns off the escape mechanism. The result is that Python retains your backslashes literally, exactly as you type them. Therefore, to fix the filename problem, just remember to add the letter r on Windows:

>>> L1T = open(r'C:\new\text.dll', 'w')
>>>

Also, because two backslashes are really an escape sequence for one backslash, you can keep your backslashes by simply doubling them up :

>>> 
>>> L1T = open('C:\\new\\text.dll', 'w')
>>>
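
You can compare the three spellings of the path without opening any file at all. A minimal sketch, with variable names chosen just for this example:

```python
plain = 'C:\new\text.dll'       # \n and \t are turned into newline and tab
raw = r'C:\new\text.dll'        # raw string: backslashes kept as typed
doubled = 'C:\\new\\text.dll'   # each \\ stores one backslash

print(repr(plain))              # shows the newline and tab hiding in the name
print(raw == doubled)           # True
print(len(plain), len(raw))     # 13 15
```

The raw and doubled forms are the same 15-character string; the plain form is two characters shorter because each escape collapsed into a single character.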

Triple Quotes Code Multiline Block Strings

So far, you’ve seen single quotes, double quotes, escapes, and raw strings in action. Python also has a triple-quoted string literal format, sometimes called a block string, for coding multiline text data. This form begins with three quotes (of either the single or double variety), is followed by any number of lines of text, and is closed with the same triple-quote sequence that opened it. Single and double quotes embedded in the string’s text may be, but do not have to be, escaped. The string does not end until Python sees three unescaped quotes of the same kind used to start the literal. See the example below:

>>> Luke_Skywalker = '''When nine hundred years
old you reach,
look as good you will not.
'''
>>> Luke_Skywalker
'When nine hundred years\nold you reach,\nlook as good you will not.\n'
>>>

Python collects all the triple-quoted text into a single multiline string, with embedded newline characters \n at the places where your code has line breaks. Triple quotes are also commonly used to disable several lines of code at once, which is what I personally do. Strictly speaking this is not a comment: Python still evaluates the block as a string and then discards it because it is not assigned to anything. I haven't found many use cases for multi-line strings otherwise.
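
Here is a rough sketch of a triple-quoted string in a script rather than at the prompt. The function greet is a made-up example; it is there only to show one common use of a triple-quoted string, the docstring on the first line of a function body:

```python
quote = '''When nine hundred years
old you reach,
look as good you will not.
'''

# repr() shows the embedded \n characters; splitlines() breaks on them.
print(repr(quote))
lines = quote.splitlines()
print(len(lines))   # 3


def greet(name):
    """Return a greeting for name (this string is the function's docstring)."""
    return 'Hello ' + name


# Unlike a comment, the docstring is kept and can be read back.
print(greet.__doc__)
print(greet('Wendell'))   # Hello Wendell
```

So a triple-quoted string sitting at the top of a function is stored by Python, which is one more way it differs from a real # comment.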


Nice! I code mainly in Java, but have been branching out to Python and am really enjoying it.
So, learning about the raw string is something I didn't know about. I suppose I haven't tried to hardcode a path in my code, so I haven't run into that particular problem yet.

Nice tutorial! :)

Enjoy. This is part of a continuing series 'The Noobs of Python' and will escalate to more advanced techniques 'The Knights of Python' and community challenges. Some already posted

Awesome. Will this be added in separate threads then?

@Shadow_Bearer
This is its own thread, as indicated by the numbering scheme 'The Noobs of Python: Ep.2.3'. You can click on the tag noobsofpython to find all the other Ep. tutorials in the series.

String Operations

There are four binary operators that act on strings: in , not in , + , and * . The first three expect both operands to be strings. The last requires the other operand to be an integer. A one-character substring can be extracted with subscription and a longer substring by slicing. Both use square brackets, as we’ll see below.

The in and not in operators test whether the first string is a substring of the second one (starting at any position). The result is True or False :

>>> 
>>> 'Level1Tech' in 'Level1TechLevel1TechLevel1Tech'
True
>>> 'Level1Tech' not in 'Level1TechLevel1TechLevel1Tech'
False
>>> 'Level' in 'Level1TechLevel1TechLevel1Tech'
True
>>> '1' not in 'Level1TechLevel1TechLevel1Tech'
False
>>>

Subscription is expressed with a pair of square brackets enclosing an integer-valued expression called an index. A subscription selects an item of a sequence (string, tuple or list) or mapping (dictionary) object. The first character is at position 0, not 1 :

>>> 'Level1Tech'[6]
'T'
>>>
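
A small sketch of subscription in a script, including what happens when the index runs off the end. The names are made up for the example, and the tuple and dictionary are there only to show that the same square brackets work on other types:

```python
forum = 'Level1Tech'

first = forum[0]                 # positions start at 0
seventh = forum[6]               # so index 6 is the seventh character
last = forum[len(forum) - 1]     # the last valid index is len() - 1
print(first, seventh, last)      # L T h

# One step past the end raises IndexError.
try:
    forum[len(forum)]
except IndexError as err:
    print('IndexError:', err)

# The same brackets select items from tuples and dictionaries.
pair = ('Level1', 'Tech')
lookup = {'forum': 'Level1Tech'}
print(pair[1], lookup['forum'])  # Tech Level1Tech
```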

The subscription expression is related to slicing, which is covered (and will continue to be) in:
The Noobs of Python: Ep.2.1 - List : Slice, Loops and Operators

String Concatenation and Formatting

Going further in depth with Strings we touch on Concatenations. Concatenation is a big word that means to combine or add two things together. In this case, we want to know how to add two or more strings together. As you might suspect, this operation is very easy in Python as seen in the example below:

>>> L1 = 'Level1Techs'
>>> L1F = 'Forum'
>>> Concat = L1 + L1F
>>> print(Concat)
Level1TechsForum
>>>

Q: That was a very easy example of putting 2 variables together, What else can we do?
A: The + operator is the addition operator when it operates on two integers or floating-point values, but when it operates on two strings it joins them, as the Operators example above showed. You can chain it to build one string out of several pieces like so:

>>> 'Level' + '1' + 'Techs'
'Level1Techs'
>>>

Note: If you try to use the + operator on a string and an integer value, Python will not know how to handle this, and it will display an error message as seen below. The exact wording of the TypeError varies between Python versions; this one is from IDLE 3.5.

>>> 'Level' + 1 + 'Techs'
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    'Level' + 1 + 'Techs'
TypeError: Can't convert 'int' object to str implicitly
>>>

Q: Ah cool ! You showed an error message ! I thought you never made mistakes Lol :)
A: More on converting data types and error handling in a later Ep.
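
In the meantime, here is a minimal sketch of one common way around the TypeError shown above: convert the integer with the built-in str() before using +. The variable names are just for illustration:

```python
level = 1                                  # an int, not a string

# str() turns the number into text so + can join it with other strings.
name = 'Level' + str(level) + 'Techs'
print(name)                                # Level1Techs

# Going the other way, int() turns a string of digits back into a number.
number = int('1') + 1
print(number)                              # 2
```

Python makes you pick the conversion yourself, so it never has to guess whether you meant addition or concatenation.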

Q: Are there other ways to do string concatenation?
A: Absolutely ! Here is another example with a for loop

>>> names = ['Wendell', 'Twindell', 'Gwendell']
>>> for name in names:
	print('Hello ' + name)

	
Hello Wendell
Hello Twindell
Hello Gwendell
>>>

More on loops in The Noobs of Python: Ep.2.1 - List : Slice, Loops and more [Updated]

I've shown you the + operator for some simple concatenation, but the truth is that using the + operator over and over to build up a long string can be inefficient: strings are immutable, so each + creates a brand-new string, and that can slow your program’s execution down. Python itself isn’t that slow. Often, it works out better to manipulate a list of words and then use string.join(sequence), where string is the separator, to return a single string that is a concatenation of the strings in the sequence.

>>> 
>>> l1 = 'Wendell'
>>> l2 = 'Twindell'
>>> l3 = 'Gwendell'
>>> 
>>> Level2 = ' '
>>> 
>>> Level2.join([l1, l2, l3])
'Wendell Twindell Gwendell'
>>>

Keep in mind that string.join() is expecting a sequence of strings as an argument.

>>> 
>>> l1 = 'Wendell'
>>> l2 = 'Twindell'
>>> l3 = 'Gwendell'
>>> Level2 = ' - '
>>> Level2.join([l1, l2, l3])
'Wendell - Twindell - Gwendell'
>>>
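
To put the + approach and join() side by side, here is a small sketch that builds the same string both ways. It does not measure speed; it only shows that the results match and how much less code join() needs:

```python
names = ['Wendell', 'Twindell', 'Gwendell']

# Building with + in a loop: every pass creates a brand-new string.
greeting = ''
for name in names:
    if greeting:                     # add the separator between names only
        greeting = greeting + ' - '
    greeting = greeting + name

# join() produces the same result in a single call on the separator.
joined = ' - '.join(names)

print(joined)                        # Wendell - Twindell - Gwendell
print(greeting == joined)            # True
```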

You may need to convert other data types into strings and join up any sublists first, so that you present the outermost join() with a list of strings.

>>> 
>>> ex = 'the Level1Techs forums'
>>> ex2 = 'member'
>>> l1 = ' '
>>> l1.join(('Wendell', l1.join([ex, ex2])))
'Wendell the Level1Techs forums member'
>>>

This is where you need to start keeping an eye out for nested parentheses. IDLE 3.5 comes in handy here for NoobsOfPython because, as you type each of the last three closing parentheses, it highlights the opening one it pairs with:

>>> l1.join(('Wendell', l1.join([ex, ex2])))

Try this out if you have IDLE 3.5. It does become confusing at times, nesting all of these strings and methods together. Post your comments if you have any !

Here is a cleaner way of achieving the same results:

>>> Ex1 = l1.join([ex, ex2])
>>> l1.join(['Wendell', Ex1])
'Wendell the Level1Techs forums member'
>>>
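
Coming back to the point about converting other data types and joining sublists first, here is a rough sketch. The lists scores and rows are made-up data for the example:

```python
scores = [1, 2, 3]                   # integers, not strings

# join() refuses anything that is not a string.
try:
    ', '.join(scores)
except TypeError as err:
    print('TypeError:', err)

# Convert each item with str() first, then join.
parts = []
for score in scores:
    parts.append(str(score))
as_text = ', '.join(parts)
print(as_text)                       # 1, 2, 3

# Sublists: join each inner list first, then join the results.
rows = [['Wendell', 'Twindell'], ['Gwendell']]
inner = []
for row in rows:
    inner.append(' '.join(row))
combined = ' | '.join(inner)
print(combined)                      # Wendell Twindell | Gwendell
```

Doing the inner joins in their own step, as in the cleaner version above, also keeps the parentheses easy to follow.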

Q: If the + operator can be used on strings for concatenation, what about the * operator?
A: The * operator is used for replication. Here is a quick example:

>>> 
>>> ex * 2
'the Level1Techs forumsthe Level1Techs forums'
>>>
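
A few more things to try with replication, as a minimal sketch (variable names are only for the example):

```python
ex = 'the Level1Techs forums'

doubled = ex * 2
print(doubled)               # the Level1Techs forumsthe Level1Techs forums

# Handy for drawing separator lines.
line = '-' * 20
print(line)

# The integer can come first, and zero gives an empty string.
flipped = 2 * 'ab'
empty = 'ab' * 0
print(flipped, repr(empty))  # abab ''

# The other operand must be an integer, not another string.
try:
    'ab' * 'cd'
except TypeError as err:
    print('TypeError:', err)
```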

Post some of your examples and questions below !

Code_On_Code_Warriors