By RidingThisToTheTop


2008-11-08 18:25:24 8 Comments

What is the Python equivalent of Perl's chomp function, which removes the last character of a string if it is a newline?

28 comments

@DeepBlue 2019-11-06 20:50:30

First split lines then join them by any separator you like.

  x = ' '.join(x.splitlines())

should work like a charm.

@sim 2019-10-22 07:43:27

import string

s = '''Hello  World \t\n\r\tHi There'''
# Map every whitespace character to None so that translate() deletes it
s.translate({ord(c): None for c in string.whitespace})
>'HelloWorldHiThere'

With regex

import re

s = '''  Hello  World 
\t\n\r\tHi '''
print(re.sub(r"\s+", "", s))  # \s matches all whitespace characters
>HelloWorldHi

Replace \n,\t,\r

s.replace('\n', '').replace('\t','').replace('\r','')
>'  Hello  World Hi '

With regex

s = '''Hello  World \t\n\r\tHi There'''
regex = re.compile(r'[\n\r\t]')
regex.sub("", s)
>'Hello  World Hi There'

with Join

s = '''Hello  World \t\n\r\tHi There'''
# split() with no argument splits on any run of whitespace, so repeated spaces collapse too
' '.join(s.split())
>'Hello World Hi There'

@Venfah Nazir 2018-06-15 07:24:21


This works for both Windows and Linux line endings. (re.sub is a bit expensive, so this is mainly useful if you specifically want a regex-only solution.)

import re 
if re.search("(\\r|)\\n$", line):
    line = re.sub("(\\r|)\\n$", "", line)

@wjandrea 2019-01-27 22:39:26

Why use re.search where you just need re.sub?

@Alien Life Form 2015-10-01 08:33:32

This would replicate exactly perl's chomp (minus behavior on arrays) for "\n" line terminator:

def chomp(x):
    if x.endswith("\r\n"): return x[:-2]
    if x.endswith("\n") or x.endswith("\r"): return x[:-1]
    return x

(Note: it does not modify the string in place; it does not strip extra trailing whitespace; it takes \r\n into account.)

@Taylor Edmiston 2017-07-27 05:24:28

I'm bubbling up my regular expression based answer from one I posted earlier in the comments of another answer. I think using re is a clearer more explicit solution to this problem than str.rstrip.

>>> import re

If you want to remove one or more trailing newline chars:

>>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
'\nx'

If you want to remove newline chars everywhere (not just trailing):

>>> re.sub(r'[\n\r]+', '', '\nx\r\n')
'x'

If you want to remove only 1-2 trailing newline chars (i.e., \r, \n, \r\n, \n\r, \r\r, \n\n)

>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
'\nx'

I have a feeling that what most people really want here is to remove just one occurrence of a trailing newline sequence, either \r\n or \n, and nothing more.

>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
'\nx\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
'\nx\r\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
'\nx'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
'\nx'

(The ?: is to create a non-capturing group.)

(By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would be reduced to foo, whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)

@ShadowRanger 2017-08-11 15:26:40

You could skip the non-capturing group, even for your final approach, with the regex r'\r?\n$'. Likely more efficient, since regex engines have a harder time optimizing alternations. Also note that if you're going to do this many times, it will be significantly faster (especially if you're intermingling with other re uses) to re.compile the expression once up front, then use the sub method of the compiled regex object; module functions are Python level and check a cache for compiled regexes first (creating/caching if missing), then call the matching method; skipping that lookup helps.

@ShadowRanger 2017-08-11 15:40:21

Also, side-note: Since you're trying to match the \n directly, you might want to use \Z over $ (or just match \r?$, since $ implicitly can match just before the newline at the end of a string).
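A minimal sketch of what these two comments suggest, assuming you only care about \n and \r\n terminators: the pattern is compiled once and anchored with \Z, so only the very last terminator is removed. The names `_TRAILING_NEWLINE` and `chomp_re` are made up for illustration.

```python
import re

# Hypothetical names for illustration; compiled once, reused on every call.
_TRAILING_NEWLINE = re.compile(r'\r?\n\Z')

def chomp_re(s):
    """Remove at most one trailing '\\n' or '\\r\\n' from s."""
    return _TRAILING_NEWLINE.sub('', s, count=1)

print(repr(chomp_re('x\n\n')))   # 'x\n'
print(repr(chomp_re('x\r\n')))   # 'x'
print(repr(chomp_re('x')))       # 'x'
```

Because \Z matches only at the absolute end of the string, it avoids the case where $ matches just before a final newline.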

@kuzzooroo 2015-01-11 18:47:33

I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code:

import operator

def chomped_lines(it):
    # Lazily strip every trailing '\r' and '\n' from each line
    return map(operator.methodcaller('rstrip', '\r\n'), it)

Sample usage:

with open("file.txt") as infile:
    for line in chomped_lines(infile):
        process(line)

@ShadowRanger 2017-08-11 15:34:15

Note: With operator.methodcaller and map (itertools.imap on Py2) you can push this work to the C layer, avoiding Python level generator code (and thereby running a bit faster, though admittedly I/O overhead is likely to mask small gains): for line in map(operator.methodcaller('rstrip', '\r\n'), infile):. It could still be factored out as def chomped_lines(it): return map(operator.methodcaller('rstrip', '\r\n'), it).
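For comparison, here is a small sketch that puts the map-based version next to an equivalent generator expression. The function names and the `sample` list (a stand-in for a file object) are assumptions made for the example. Both versions strip every trailing \r and \n, not just one terminator.

```python
import operator

def chomped_lines_map(lines):
    # map() is lazy on Python 3, so nothing is stripped until iteration.
    return map(operator.methodcaller('rstrip', '\r\n'), lines)

def chomped_lines_gen(lines):
    # Equivalent generator expression; arguably easier to read.
    return (line.rstrip('\r\n') for line in lines)

sample = ['a\n', 'b\r\n', '\n', 'c']    # stand-in for a file object
print(list(chomped_lines_map(sample)))  # ['a', 'b', '', 'c']
print(list(chomped_lines_gen(sample)))  # ['a', 'b', '', 'c']
```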

@Rich Bradshaw 2008-11-08 18:31:53

Try the method rstrip() (see doc Python 2 and Python 3)

>>> 'test string\n'.rstrip()
'test string'

Python's rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp.

>>> 'test string \n \r\n\n\r \n\n'.rstrip()
'test string'

To strip only newlines:

>>> 'test string \n \r\n\n\r \n\n'.rstrip('\n')
'test string \n \r\n\n\r '

There are also the methods lstrip() and strip():

>>> s = "   \n\r\n  \n  abc   def \n\r\n  \n  "
>>> s.strip()
'abc   def'
>>> s.lstrip()
'abc   def \n\r\n  \n  '
>>> s.rstrip()
'   \n\r\n  \n  abc   def'

@brian d foy 2008-11-08 21:04:16

I'm not a Python person so I don't have the answer to this, but Perl's chomp() actually removes the input record separator from the end. That's a newline on Unixy things, but may be different (e.g. Windows) and it's mutable. Is there a way to remove that value only once from the end of a string?

@Peter Hosey 2008-11-09 06:13:48

brian d foy: Python doesn't have an input record separator like awk and Perl have.
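Since Python has no global record separator, one way to approximate the "remove that value only once" behaviour is to pass the separator explicitly. This is only a sketch: the function name `chomp` and its `sep` parameter are illustrative, and it does not try to reproduce Perl's special modes.

```python
def chomp(s, sep='\n'):
    """Remove one trailing occurrence of sep from s, if present."""
    if sep and s.endswith(sep):
        return s[:-len(sep)]
    return s

print(repr(chomp('a\n\n')))          # 'a\n'
print(repr(chomp('a\r\n', '\r\n')))  # 'a'
print(repr(chomp('a;b;', ';')))      # 'a;b'
```

On Python 3.9 and later, s.removesuffix(sep) does the same thing for a single fixed separator.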

@Andrew Grimm 2009-07-03 01:43:48

Is \n sufficient? >>> "test string\r\n".rstrip("\n") 'test string\r'

@dom0 2012-05-15 16:12:38

\r itself won't trigger a newline anywhere (except for OSX, but who cares about OSX?)

@skue 2012-11-04 19:03:29

@csde_rats, that's not true: OS X uses \n for newlines just like Unix. (Prior to OS X, MacOS did use \r as a line separator, but that ended 10 years ago.)

@AlcubierreDrive 2012-11-07 08:11:24

@briandfoy Python has built-in support for Universal newlines (only when reading, not when writing). You open the file in either "U" or "rU" mode, and then regardless of Windows, Linux, Mac, whatever, by the time the text reaches your python code, any style of newline has been replaced with "\n". See: python.org/dev/peps/pep-0278

@bitek 2013-02-05 20:58:43

This only removes the first \n not all the \n's. What if I have a string as: "sadsa\nsadas\nsadsad\n" How can I remove ALL the newlines? replace('\n', '') doesn't work either.

@yegle 2013-02-28 15:39:09

Just in case someone misunderstands the example: the argument for strip, rstrip, lstrip is used as a list of characters, not a string. So 'abcdedcba'.strip("abc") gives def, not cdedcba!

@Alix Axel 2013-05-13 16:55:35

@skue: What about Windows? If you remove the \n you still get \r.

@fbicknel 2014-10-02 17:40:35

@yegle, I think you mean 'ded', not 'def'. :)
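To make the point of the two comments above concrete, here is a short sketch showing that the argument is treated as a set of characters rather than as a literal prefix or suffix. The sample strings are arbitrary.

```python
# The argument is a set of characters, not a prefix/suffix string.
print('abcdedcba'.strip('abc'))          # ded

# Any trailing mix of '\r' and '\n' is removed, in any order.
print(repr('x\n\r\r\n'.rstrip('\r\n')))  # 'x'

# With only '\n' in the set, stripping stops at the first '\r' from the right.
print(repr('x\n\r\r\n'.rstrip('\n')))    # 'x\n\r\r'
```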

@Colin D Bennett 2015-02-03 18:16:28

@AlixAxel As stated above by @AlcubierreDrive, handling newlines in a portable way means they are converted to the canonical \n form when they are read from the file. Then, your program working with strings will never see the \r characters.

@sethmlarson 2016-03-07 20:37:33

I know this isn't really a question but is brought up in the comments. Is there any reason why text.rstrip(os.linesep) would not work in an OS-independent way to achieve this?

@Script Kitty 2016-04-12 01:06:12

I'm going to go ahead and spell this out because I'm a noob and I spent a while wondering why it wasn't working. .strip() does not alter the string (probably has something to do with immutable strings). If not in command line, you'll want "string = string.strip()"

@Robert Sim 2017-09-28 17:40:20

rstrip() on its own is going to give you a lot of headaches if you're processing TSVs with some empty columns. For example "foo\tbar\t\t\n".rstrip() will remove the last two empty columns from your data.
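A quick sketch of the TSV problem described above, using the same sample row. Restricting the strip set to end-of-line characters is one way to keep the empty columns.

```python
row = "foo\tbar\t\t\n"

# Bare rstrip() also eats the trailing tabs, losing two empty columns.
print(row.rstrip().split('\t'))        # ['foo', 'bar']

# Restricting the strip set to EOL characters keeps them.
print(row.rstrip('\r\n').split('\t'))  # ['foo', 'bar', '', '']
```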

@Alex Hall 2019-12-03 11:49:02

Necessary to get clean string variables passed via argparse. Windows newlines made them not even show up at all in string concatenation.

@user3780389 2017-04-26 17:58:16

It looks like there is not a perfect analog for perl's chomp. In particular, rstrip cannot handle multi-character newline delimiters like \r\n. However, splitlines does as pointed out here. Following my answer on a different question, you can combine join and splitlines to remove/replace all newlines from a string s:

''.join(s.splitlines())

The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then, splitlines is called again to remove the delimiters on just the last "line":

def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''
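One caveat worth checking before relying on splitlines for this: str.splitlines treats more than \n, \r and \r\n as line boundaries, so a chomp built on it would also drop, for example, a trailing form feed. A small sketch with arbitrary sample strings:

```python
# Form feed (\x0c) counts as a line boundary for splitlines().
print('a\x0c'.splitlines())     # ['a']

# So does the Unicode line separator U+2028.
print('a\u2028b'.splitlines())  # ['a', 'b']

# split('\n') only knows about '\n' and keeps the form feed.
print('a\x0cb\n'.split('\n'))   # ['a\x0cb', '']
```

If that is not what you want, an explicit endswith check on the terminators you expect is the more conservative choice.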

@internetional 2016-11-22 20:17:58

There are three types of line endings that we normally encounter: \n, \r and \r\n. A rather simple regular expression in re.sub, namely r"\r?\n?$", is able to catch them all.

(And we gotta catch 'em all, am I right?)

import re

re.sub(r"\r?\n?$", "", the_text, 1)

With the last argument, we limit the number of occurrences replaced to one, mimicking chomp to some extent. Example:

import re

text_1 = "hellothere\n\n\n"
text_2 = "hellothere\n\n\r"
text_3 = "hellothere\n\n\r\n"

a = re.sub(r"\r?\n?$", "", text_1, 1)
b = re.sub(r"\r?\n?$", "", text_2, 1)
c = re.sub(r"\r?\n?$", "", text_3, 1)

... where a == b == c is True.

@Agostino 2016-11-22 21:20:43

You don't even need full fledged regular expressions. rstrip("\r\n") is a catch-all. Try print(text_2.rstrip('\r\n')).

@internetional 2016-11-22 21:39:23

@Agostino : True, given that str.rstrip() solves the problem. It depends on which needs you have. This solution is specifically made for the cases when you need to remove just the last "\n", "\r" or "\r\n" but not all of them (if there are multiple "\n" in the string). re.sub(r"\r?\n?$", "", text_1, 1) returns "hellothere\n\n" and text_1.rstrip("\r\n") returns "hellothere" which is a different string.

@internetional 2016-11-22 21:43:08

What I am trying to say is: that str.strip() is a catch-all is sometimes the very problem.

@user7121455 2016-11-22 18:30:37

>>> '   spacious   '.rstrip()
'   spacious'
>>> "AABAA".rstrip("A")
'AAB'
>>> "ABBA".rstrip("AB") # both AB and BA are stripped
''
>>> "ABCABBA".rstrip("AB")
'ABC'

@Agostino 2016-11-22 21:11:34

The example I needed! So rstrip("\r\n") will strip both '\n' and '\r' in any combination at the end of the line!

@olibre 2017-05-11 19:31:17

@Agostino No need to provide "\r\n" For example: ' spacious \n\r\n\r \n\n'.rstrip() produces ' spacious'

@Agostino 2017-05-15 19:15:24

@olibre the code you suggest will also strip other blank/space characters, which might not be what one needs. In fact, I only needed to strip combinations of eol characters. Still, thanks for pointing this out.

@Hackaholic 2014-11-21 04:29:07

you can use strip:

line = line.strip()

demo:

>>> "\n\n hello world \n\n".strip()
'hello world'

@Tarik 2015-03-01 06:03:22

Tried this solution but it strips off leading blanks in the line.

@Hackaholic 2015-03-04 23:08:02

@Tarik you can use rstrip

@Flimm 2016-06-30 16:18:47

rstrip will delete all trailing whitespace, unlike chomp which only deletes at most one newline.

@slec 2015-03-09 08:02:55

s = s.rstrip()

will remove all trailing whitespace, including every newline, from the end of the string s. The assignment is needed because rstrip returns a new string instead of modifying the original string.

@kiriloff 2013-05-13 16:41:22

You may use line = line.rstrip('\n'). This will strip all newlines from the end of the string, not just one.

@Jamie 2010-03-07 16:07:27

I might use something like this:

import os
s = s.rstrip(os.linesep)

I think the problem with rstrip("\n") is that you'll probably want to make sure the line separator is portable (some antiquated systems are rumored to use "\r\n"). The other gotcha is that rstrip will strip out repeated whitespace. Hopefully os.linesep will contain the right characters. The above works for me.

@apiguy 2012-01-18 18:50:39

This won't work however if you are trying to clean up user submitted content in a web application. The user content could come from any source and contain any newline chars.

@ChuckCottrill 2016-02-06 02:56:00

Good point, except that you may be processing 'foreign' files (from antiquated systems) on your modern os.

@Mad Physicist 2017-04-28 19:55:22

Keep in mind also that if you are reading a file in text mode, this won't work on a Windows system either, because the trailing character will always be converted to '\n'.

@dtauxe 2019-04-18 20:06:34

@MadPhysicist You are right that it does convert it, but it still works because it is the same as rstrip('\r\n') and rstrip() will strip any characters that are in the argument.

@Sameer Siruguri 2008-11-28 17:31:34

Note that rstrip doesn't act exactly like Perl's chomp() because it doesn't modify the string. That is, in Perl:

$x="a\n";

chomp $x

results in $x being "a".

but in Python:

x="a\n"

x.rstrip()

will mean that the value of x is still "a\n". Even x=x.rstrip() doesn't always give the same result, as it strips all whitespace from the end of the string, not just one newline at most.

@kostmo 2010-03-29 20:17:17

Also, strip() removes repeated characters, whereas chop/chomp only removes one newline

@Brian Peterson 2013-09-25 03:35:57

Ah yes, strings are immutable, thanks for the reminder!

@Mike 2008-11-09 00:11:21

The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.

>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'

Using '\r\n' as the parameter to rstrip means that it will strip out any trailing combination of '\r' or '\n'. That's why it works in all three cases above.

This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character. Therefore, the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n' which is not what I wanted. In that case, I simply sliced off the last two characters instead.

Note that unlike Perl's chomp function, this will strip all specified characters at the end of the string, not just one:

>>> "Hello\n\n\n".rstrip("\n")
'Hello'

@Peter Hosey 2008-11-09 06:15:11

Note that modern Mac OS X apps use \n. Only old Carbon apps originally written for Mac OS use \r.

@Mike 2008-11-09 11:35:13

Thanks for the clarification. Of course, the rstrip('\r\n') still works in that case too.

@Eli Collins 2011-08-15 13:44:12

There's also os.linesep, which contains the EOL sequence for the current OS.

@kevinarpe 2015-02-12 03:58:23

This is the best answer: It only strips newlines, and does it correctly for the most common platforms.

@fechnert 2015-05-28 15:35:57

plus +1 For using \n and \r

@jfs 2016-01-25 06:56:14

@Tim: normally, you won't see \r in the input because Python uses universal newlines mode by default ('\n', '\r', or '\r\n' are translated to '\n' while reading). If you want to take Unicode newlines into account, see str.splitlines().
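The translation described in this comment can be observed with a small sketch on Python 3. It writes a temporary file with mixed line endings (the file contents are made up for the example) and reads it back twice, once with the default text mode and once with newline='' to disable translation.

```python
import os
import tempfile

# Write raw bytes with Windows (\r\n), old Mac (\r) and Unix (\n) endings.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'one\r\ntwo\rthree\n')
    path = tmp.name

try:
    # Default text mode (newline=None): every ending is translated to '\n'.
    with open(path) as f:
        translated = f.readlines()
    # newline='' still splits on all endings but leaves them untranslated.
    with open(path, newline='') as f:
        raw = f.readlines()
finally:
    os.remove(path)

print(translated)  # ['one\n', 'two\n', 'three\n']
print(raw)         # ['one\r\n', 'two\r', 'three\n']
```

In the default mode, rstrip('\n') is therefore enough; rstrip('\r\n') matters mainly for strings that did not come through a text-mode file.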

@Robert Sim 2017-09-28 17:43:37

Upvoting because this is much safer than rstrip() with no arguments.

@minopret 2013-10-23 01:32:11

An example in Python's documentation simply uses line.strip().

Perl's chomp function removes one linebreak sequence from the end of a string only if it's actually there.

Here is how I plan to do that in Python, if process is conceptually the function that I need in order to do something useful to each line from this file:

import os
sep_pos = -len(os.linesep)
with open("file.txt") as f:
    for line in f:
        if line[sep_pos:] == os.linesep:
            line = line[:sep_pos]
        process(line)

@Ciro Santilli 新疆改造中心法轮功六四事件 2015-09-26 21:45:06

Finally, an answer that only removes it once (like the actual chomp...) and is OS portable!

@Help me 2016-05-20 12:29:21

Just use :

line = line.rstrip("\n")

or

line = line.strip("\n")

You don't need any of this complicated stuff

@Flimm 2016-06-30 16:20:31

Note that this is not the same as chomp.

@Andrew Grimm 2009-07-03 01:49:19

I don't program in Python, but I came across an FAQ at python.org advocating S.rstrip("\r\n") for python 2.2 or later.

@Stephen Miller 2015-10-28 13:56:42

If you are concerned about speed (say you have a looong list of strings) and you know the nature of the newline char, string slicing is actually faster than rstrip. A little test to illustrate this:

import time

loops = 50000000

def method1(loops=loops):
    # Slice off the last character unconditionally
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # use xrange on Python 2
        out_string = test_string[:-1]
    t1 = time.time()
    print('Method 1: ' + str(t1 - t0))

def method2(loops=loops):
    # Let rstrip() find and remove the trailing whitespace
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # use xrange on Python 2
        out_string = test_string.rstrip()
    t1 = time.time()
    print('Method 2: ' + str(t1 - t0))

method1()
method2()

Output:

Method 1: 3.92700004578
Method 2: 6.73000001907

@Stephen Miller 2015-10-28 14:21:22

I know I should probably use "global loops" inside of the functions, but this works as well.

@spky 2016-05-24 21:17:37

This test is wrong and not fair. In method1 you are just chopping off the last character no matter what, whereas in method2 .rstrip() first checks whether the end of the string contains undesired characters and chops them off only if some were found. Please implement some check for characters in method1 and test again!

@Stephen Miller 2016-05-29 13:30:24

As I said in the intro to the answer: If you know the nature of the newline char, then this is useful. If you don't then yes, you obviously need to implement some sort of character check - or just use rstrip. I did not mean to be "unfair" to rstrip, but simply illustrate a not so insignificant difference that may be worth considering in some situations.
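For anyone who wants to repeat the comparison, including the checked variant asked for above, here is a sketch using the standard timeit module. The repeat count is small and arbitrary, and absolute numbers will differ between machines and Python versions, so no expected timings are shown.

```python
import timeit

setup = "s = 'num\\n'"
number = 100_000  # small, arbitrary repeat count for a quick comparison

# Unconditional slice, rstrip, and a slice guarded by an endswith check.
t_slice = timeit.timeit("s[:-1]", setup=setup, number=number)
t_rstrip = timeit.timeit("s.rstrip('\\n')", setup=setup, number=number)
t_checked = timeit.timeit("s[:-1] if s.endswith('\\n') else s",
                          setup=setup, number=number)

print('slice:         %.4f s' % t_slice)
print('rstrip:        %.4f s' % t_rstrip)
print('checked slice: %.4f s' % t_checked)
```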

@user4178860 2014-10-24 18:34:12

A catch all:

line = line.rstrip('\r|\n')

@Flimm 2016-06-30 16:20:15

rstrip does not take regular expression. "hi|||\n\n".rstrip("\r|\n") returns "hi"

@user1151618 2014-01-20 19:07:03

import re

r_unwanted = re.compile("[\n\t\r]")
r_unwanted.sub("", your_text)

@NoahR 2014-11-10 21:55:57

This is also going to remove tab whitespace, which the original question does not request. ( Due to the \t character )

@Leozj 2013-10-11 11:56:21

If your question is to clean up all the line breaks in a multiple line str object (oldstr), you can split it into a list according to the delimiter '\n' and then join this list into a new str(newstr).

newstr = "".join(oldstr.split('\n'))

@mihaicc 2012-02-29 22:40:11

"line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
>>> 'line 1line 2...'

or you could always get geekier with regexps :)

have fun!

@Steve Koch 2013-01-20 16:27:20

This worked great for me trying to quickly turn a text file with line endings into one line of text. I'm a newbie, so not sure if there's a better way to do it, but it worked, thanks! (Strip seemed to only work from the ends, not internally)

@Doorknob 2013-07-07 18:19:22

Why not just use one replace statement, like .replace('\n|\r', '')?

@Taylor Edmiston 2014-02-09 01:50:35

Just in case anyone else wants to use the idea from @DoorknobofSnow, it's just a small change to use the regex module: after import re, the call re.sub('\n|\r', '', '\nx\n\r\n') returns 'x'.

@Bhargav 2017-07-25 14:59:59

Using this and regex technique as @TaylorEdmiston mentioned should be the proper answer.

@Taylor Edmiston 2017-07-27 05:26:07

@Bhargav I've added an answer to this question based upon this comment as you suggested while also exploring a few other related options. I also clarified why I think regex is a better solution to this problem than str.rstrip since that's what most answers use.

@Chij 2011-11-30 14:04:19

workaround solution for special case:

if the newline character is the last character (as is the case with most file inputs), then for any element in the collection you can index as follows:

foobar= foobar[:-1]

to slice out your newline character.

@Cacovsky 2012-06-01 19:14:17

Sometimes the newline is not just the last character but the last few, especially on Windows, as others have pointed out.

@Carlos Valiente 2011-04-27 11:43:20

Careful with "foo".rstrip(os.linesep): That will only chomp the newline characters for the platform where your Python is being executed. Imagine you're chomping the lines of a Windows file under Linux, for instance:

$ python
Python 2.7.1 (r271:86832, Mar 18 2011, 09:09:48) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> sys.platform
'linux2'
>>> "foo\r\n".rstrip(os.linesep)
'foo\r'
>>>

Use "foo".rstrip("\r\n") instead, as Mike says above.

@Flimm 2016-06-30 16:16:26

The other thing to note is that it does not remove at most one newline, but all newlines, unlike chomp.

@ingydotnet 2011-04-23 12:42:25

rstrip doesn't do the same thing as chomp, on so many levels. Read http://perldoc.perl.org/functions/chomp.html and see that chomp is very complex indeed.

However, my main point is that chomp removes at most 1 line ending, whereas rstrip will remove as many as it can.

Here you can see rstrip removing all the newlines:

>>> import os, re
>>> 'foo\n\n'.rstrip(os.linesep)
'foo'

A much closer approximation of typical Perl chomp usage can be accomplished with re.sub, like this:

>>> re.sub(os.linesep + r'\Z','','foo\n\n')
'foo\n'

@brianmearns 2012-08-15 01:22:56

Kudos, you're the only one that pointed out this very important detail. However, as someone above noted, using os.linesep won't work if you're reading files from a different system. This might take a bit more work in Python, actually inspecting the end of the line.

@Ryan Ginstrom 2008-11-09 05:52:43

And I would say the "pythonic" way to get lines without trailing newline characters is splitlines().

>>> text = "line 1\nline 2\r\nline 3\nline 4"
>>> text.splitlines()
['line 1', 'line 2', 'line 3', 'line 4']
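A short sketch of how splitlines compares with split('\n') on text that ends with a newline, and of the keepends flag. The sample text is arbitrary.

```python
text = "line 1\nline 2\r\nline 3\n"

# splitlines() handles \r\n and yields no empty final element.
print(text.splitlines())               # ['line 1', 'line 2', 'line 3']

# keepends=True leaves each terminator attached to its line.
print(text.splitlines(keepends=True))  # ['line 1\n', 'line 2\r\n', 'line 3\n']

# split('\n') leaves the '\r' behind and adds a trailing empty string.
print(text.split('\n'))                # ['line 1', 'line 2\r', 'line 3', '']
```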
