Python string split and splitlines: Split by spaces or newlines
A Python string is split into a list of strings by the split() method.
s = 'book-car-desk'
a = s.split('-')
print(a)  # ['book', 'car', 'desk']
split() returns a Python list. The string s contains two delimiters (-), so it is split into a list of three words. If the original string doesn't contain the delimiter, the list contains only one string, which is the original string itself.
s = 'book-car-desk'
a = s.split('***')
print(a)  # ['book-car-desk']
book-car-desk doesn't contain ***, so split() returns a list that has one element.
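As a small sketch of how the delimiter argument behaves (the sample strings below are illustrative and not taken from the examples above), a delimiter may be longer than one character, and two adjacent delimiters leave an empty string in the result.

```python
# A multi-character delimiter is matched as a whole, not character by character.
s = 'book***car***desk'
words = s.split('***')
print(words)  # ['book', 'car', 'desk']

# Two delimiters in a row leave an empty string between them.
t = 'book--desk'
parts = t.split('-')
print(parts)  # ['book', '', 'desk']
```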
Split a string by one space
It's common to split sentences into words in Python programming. A Python string can be split by a space as follows.
s = 'book car desk'
a = s.split(' ')
print(a)  # ['book', 'car', 'desk']
The argument of split() is one space, so the string is split into three strings.
s = 'book car desk'
a = s.split()
print(a)  # ['book', 'car', 'desk']
When the method is called with no argument, it splits on whitespace automatically. Furthermore, even if there are several spaces between the words, or spaces at the start of the string, the outcome is the same.
s = ' book   car    desk'
a = s.split()
print(a)  # ['book', 'car', 'desk']
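Because the no-argument form ignores irregular spacing, it is a convenient way to count the words in a sentence. A minimal sketch, using a made-up sentence:

```python
sentence = '  The quick   brown fox  '

# Leading, trailing, and repeated spaces do not produce empty strings.
words = sentence.split()
print(words)       # ['The', 'quick', 'brown', 'fox']
print(len(words))  # 4
```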
No argument vs one space delimiter
split() with no argument splits a string on any whitespace, which includes newlines as well as spaces. With a one-space argument, only spaces act as delimiters, so if the original string contains newlines, they remain in the elements of the resulting list.
s = """
red
 blue
  green
"""
a = s.split()
b = s.split(' ')
print(a)  # ['red', 'blue', 'green']
print(b)  # ['\nred\n', 'blue\n', '', 'green\n']
The text between the two triple quotes is a multi-line string, and the string begins immediately after the opening triple quote. In the above example, there is one newline between the opening triple quote and red, so the first element of the resulting list has a newline before red.
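The same contrast can be shown with the escape sequences written out, which makes the hidden whitespace visible. In this sketch the sample string is invented for illustration and also contains a tab:

```python
s = 'red\n blue\t green  yellow'

# No argument: spaces, tabs, and newlines all act as separators.
no_arg = s.split()
print(no_arg)  # ['red', 'blue', 'green', 'yellow']

# One-space argument: only ' ' separates, so '\n' and '\t' stay in the
# elements and the double space yields an empty string.
one_space = s.split(' ')
print(one_space)  # ['red\n', 'blue\t', 'green', '', 'yellow']
```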
Split a string into letters
We can split a string into its individual letters using a Python list comprehension.
a = 'Apple'
b = [c for c in a]
print(b)  # ['A', 'p', 'p', 'l', 'e']
c is a loop variable that takes each character of a in turn, and each value of c becomes an element of the newly generated list.
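Since a string is itself iterable, the built-in list() gives the same result as the plain comprehension. The comprehension becomes more useful when each character also needs to be transformed or filtered. A minimal sketch:

```python
a = 'Apple'

# list() consumes the string one character at a time.
letters = list(a)
print(letters)  # ['A', 'p', 'p', 'l', 'e']

# A comprehension can transform each character while splitting.
upper_letters = [c.upper() for c in a]
print(upper_letters)  # ['A', 'P', 'P', 'L', 'E']
```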
Split by newline
splitlines() splits a string at line boundaries such as the newline (line feed) character. It is often used to split an article into paragraphs.
a = """Microsoft
Facebook
Netflix
"""
b = a.splitlines()
print(b)  # ['Microsoft', 'Facebook', 'Netflix']
splitlines() doesn't skip empty lines, so if there are consecutive newlines in the original string, empty strings will be in the resulting list.
a = """

Microsoft


Facebook

Netflix

"""
b = a.splitlines()
print(b)  # ['', '', 'Microsoft', '', '', 'Facebook', '', 'Netflix', '']
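Note that splitlines() is not quite the same as split('\n'). As a brief sketch with an illustrative string: splitlines() does not add an empty string for a trailing newline, and it also recognizes Windows-style \r\n line endings.

```python
text = 'Microsoft\nFacebook\r\nNetflix\n'

# splitlines() treats both '\n' and '\r\n' as line boundaries.
by_lines = text.splitlines()
print(by_lines)  # ['Microsoft', 'Facebook', 'Netflix']

# split('\n') leaves the '\r' in place and adds a trailing empty string.
by_newline = text.split('\n')
print(by_newline)  # ['Microsoft', 'Facebook\r', 'Netflix', '']
```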
If you want to remove the empty strings, use filter() and set the first argument to None.
a = """

Microsoft


Facebook

Netflix

"""
b = a.splitlines()
c = filter(None, b)
d = list(c)
print(d)  # ['Microsoft', 'Facebook', 'Netflix']
Empty strings are removed.
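filter(None, ...) removes only strings that are truly empty. A line that contains only spaces is still truthy and survives. If such lines should be dropped too, one option (a sketch with a made-up string, not the only approach) is a list comprehension that tests line.strip():

```python
text = 'Microsoft\n\n   \nFacebook\n\nNetflix\n'
lines = text.splitlines()

# filter(None, ...) drops '' but keeps the whitespace-only line.
non_empty = list(filter(None, lines))
print(non_empty)  # ['Microsoft', '   ', 'Facebook', 'Netflix']

# strip() turns a whitespace-only line into '', which is falsy.
non_blank = [line for line in lines if line.strip()]
print(non_blank)  # ['Microsoft', 'Facebook', 'Netflix']
```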