By Shriram


2013-06-17 05:24:37 8 Comments

How do I search and replace text in a file using Python 3?

Here is my code:

import os
import sys
import fileinput

print ("Text to search for:")
textToSearch = input( "> " )

print ("Text to replace it with:")
textToReplace = input( "> " )

print ("File to perform Search-Replace on:")
fileToSearch  = input( "> " )
#fileToSearch = 'D:\dummy1.txt'

tempFile = open( fileToSearch, 'r+' )

for line in fileinput.input( fileToSearch ):
    if textToSearch in line :
        print('Match Found')
    else:
        print('Match Not Found!!')
    tempFile.write( line.replace( textToSearch, textToReplace ) )
tempFile.close()


input( '\n\n Press Enter to exit...' )

Input file:

hi this is abcd hi this is abcd
This is dummy text file.
This is how search and replace works abcd

When I search for 'ram' and replace it with 'abcd' in the input file above, it works like a charm. But when I do the reverse, i.e. replace 'abcd' with 'ram', some junk characters are left at the end.

Replacing 'abcd' by 'ram'

hi this is ram hi this is ram
This is dummy text file.
This is how search and replace works rambcd

13 comments

@iknowitwasyoufredo 2019-06-14 13:00:46

With a single with block, you can search and replace your text:

with open('file.txt','r+') as f:
    filedata = f.read()
    filedata = filedata.replace('abc','xyz')
    f.truncate(0)
    f.write(filedata)

@ur. 2019-07-25 08:55:42

You forgot to seek to the beginning of the file before writing it. truncate doesn't do that and so you will have garbage in the file.
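
The fix described in the comment above can be sketched roughly as follows: rewind to the start of the file, write the new contents, then truncate whatever is left past the new end. The function name replace_in_place is hypothetical, and the sketch assumes the file fits in memory.

```python
def replace_in_place(path, old, new):
    """Replace every occurrence of `old` with `new` in the file at `path`.

    Hypothetical helper (not from the answer above); reads the whole file.
    """
    with open(path, 'r+') as f:
        data = f.read()
        f.seek(0)                        # rewind: read() left the position at the end
        f.write(data.replace(old, new))
        f.truncate()                     # cut off leftover data past the new end
```

Calling truncate() after the write, with no argument, cuts the file at the current position, so a shorter replacement does not leave stale characters behind.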

@jfs 2013-12-15 10:47:01

fileinput already supports inplace editing. It redirects stdout to the file in this case:

#!/usr/bin/env python3
import fileinput

with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text), end='')
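
For readers who want to try this, here is a self-contained sketch of the same loop run against a throwaway file. The wrapper name sed_like_replace and the demo file are assumptions made for illustration; the fileinput call itself is the one shown above.

```python
import fileinput
import os
import tempfile

def sed_like_replace(path, old, new):
    """Hypothetical wrapper around the in-place fileinput loop."""
    with fileinput.FileInput(path, inplace=True, backup='.bak') as file:
        for line in file:
            # While inplace=True is active, print() writes to the file.
            print(line.replace(old, new), end='')

# Demonstration on a throwaway file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('one abcd\ntwo abcd\n')

sed_like_replace(demo_path, 'abcd', 'ram')

with open(demo_path) as f:
    new_text = f.read()        # contents after the replacement
with open(demo_path + '.bak') as f:
    backup_text = f.read()     # original contents, kept because backup='.bak'
```

With backup='.bak', the original file is kept next to the edited one under the same name plus the .bak extension.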

@egpbos 2014-04-01 13:40:28

What is the end='' argument supposed to do?

@jfs 2014-04-01 13:46:11

line already has a newline. end is a newline by default; end='' stops the print() function from printing an additional newline.

@egpbos 2014-04-02 13:24:24

Ah, I see now that it's Python 3 syntax. Was getting invalid syntax in Python 2, obviously.

@jfs 2014-04-02 15:41:07

yes. The question has python-3.x tag

@craigds 2014-12-18 03:09:21

Don't use fileinput! Consider writing the code to do this yourself instead. Redirecting sys.stdout isn't a great idea, especially if you're doing it without a try..finally like fileinput does. If an exception gets raised, your stdout might never get restored.

@jfs 2014-12-18 13:16:57

@craigds: wrong. fileinput is not a tool for all jobs (nothing is) but there are many cases where it is the right tool e.g., to implement a sed-like filter in Python. Don't use a screwdriver to pound nails.

@craigds 2014-12-18 22:06:45

If you really want to redirect stdout to your file for some reason, it's not hard to do it better than fileinput does (basically, use try..finally or a contextmanager to ensure you set stdout back to its original value afterwards). The source code for fileinput is pretty eye-bleedingly awful, and it does some really unsafe things under the hood. If it were written today I very much doubt it would have made it into the stdlib.

@jfs 2014-12-18 22:33:41

@craigds: I don't see the benefit of reimplementing the diamond operator every time I need it. Don't optimize prematurely. And if you don't like the implementation, submit a patch.

@Guillaume Gendre 2014-12-21 14:10:21

there is an alternative solution for the ", end=''". You could add .rstrip() at the end of your replaces to avoid double newlines

@jfs 2014-12-21 15:56:30

@GuillaumeGendre: rstrip() might remove too much e.g., trailing whitespace. end="" is a cleaner solution.

@Jayesh Bhoi 2015-08-18 10:02:45

@GuillaumeGendre Your solution works.

@Suresh 2015-10-31 19:33:36

This solution works great but the only caveat is it rewrites every single line. What I mean is if you run "diff" on old file vs new file, you'll notice that every line appears as modified. This matters a lot if the files are in svn, like in my case. Any workarounds?

@jfs 2015-10-31 20:16:46

@Suresh: it is probably related to the universal newlines mode (if your input file has newlines in non-native format for the system then they are normalized). Create a minimal input that demonstrates the issue e.g., open('file', 'wb').write(b'\r\n\n\r'), do search/replace using the code in the answer, and post the unexpected results (if any)(print(ascii(open('file', 'rb').read()))) along with the expected results as a new question.
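
If newline normalization is indeed the cause, as the comment above suspects, one way to avoid it is to open the file with newline='' for both reading and writing, so that Python does not translate line endings. This is a sketch that sidesteps fileinput entirely; the helper name and the utf-8 default are assumptions.

```python
def replace_keep_newlines(path, old, new, encoding='utf-8'):
    """Hypothetical helper: replace text without normalizing line endings."""
    # newline='' disables newline translation on read, so '\r\n' stays '\r\n'.
    with open(path, 'r', encoding=encoding, newline='') as f:
        data = f.read()
    # newline='' on write emits the characters exactly as given.
    with open(path, 'w', encoding=encoding, newline='') as f:
        f.write(data.replace(old, new))
```

Lines that contain no match then come back byte-for-byte identical, which should keep diff output limited to lines that actually changed.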

@Suresh 2016-02-09 22:53:19

Sorry, I didn't see the comment earlier. Here's how I dealt with it:

for filename in __files__:
    tmp_name = filename + '.modified'
    with codecs.open(filename, 'r', encoding='latin-1') as fi, \
         codecs.open(tmp_name, 'w', encoding='latin-1') as fo:
        for line in fi:
            new_line = line.replace(oldv, newv)
            fo.write(new_line)
    os.remove(filename)
    os.rename(tmp_name, filename)

@jfs 2016-02-10 09:45:02

@Suresh: 1- comments are not an appropriate place to discuss possible solutions to a new question; ask a new question instead. 2- Don't use codecs.open, use io.open instead.

@R__raki__ 2016-10-02 17:10:53

@J.F.Sebastian Hi Sebastian, we can also use the sub() method from the re module. Check out my answer to this question; it totally works.

@answerSeeker 2017-02-08 21:20:17

rstrip() fixes the problems that end="" creates in python 2.7

@jfs 2017-02-09 00:55:09

@answerSeeker: to enable end='', you could use from __future__ import print_function or at the very least use .rstrip('\n') instead of .rstrip(), to avoid removing too much whitespace from the line.
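
The difference between the two rstrip calls mentioned above shows up on a line that ends in spaces. The sample line is made up for illustration.

```python
line = 'value = 1   \n'   # three trailing spaces before the newline

stripped_all = line.rstrip()           # removes the spaces and the newline
stripped_newline = line.rstrip('\n')   # removes only the newline
```

Here stripped_all loses the three trailing spaces, while stripped_newline keeps them, which is why rstrip('\n') is the safer of the two when the rest of the line must stay untouched.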

@Vitaliy Terziev 2017-05-10 09:35:43

good piece of code, here is the 2.7 version ->stackoverflow.com/questions/30835090/…

@jfs 2017-06-19 14:53:56

@Christophe Roussy read the question. Notice the names. Don't make such edits without a comment

@Christophe Roussy 2017-06-19 15:07:07

@J.F.Sebastian ok for the question, but then the naming is bad in the question too, as 'textToReplace' is the text to replace in English. This is very confusing for beginners, but I understand why you kept the original.

@Nitish Kumar Pal 2018-04-18 06:15:56

@jfs I use with open(os.path.join(path,file), "r", encoding = "utf-8") as file: to open the file and avoid UnicodeDecodeError, but in the above case of FileInput(filename, inplace=True, backup='.bak'), how am I supposed to avoid that? Please comment on that.

@jfs 2018-04-18 06:22:36

@NitishKumarPal if it is not clear from the documentation, ask a separate Stack Overflow question (how to specify the character encoding for FileInput)

@Ridhuvarshan 2018-06-06 02:09:12

What is the use of backup='.bak'? I couldn't find it anywhere in the documentation

@jfs 2018-06-06 05:15:56

@Ridhuvarshan open the fileinput documentation, search for the word "backup" e.g., follow the link then press Ctrl+f and start typing the word backup. If it fails; ask a separate Stack Overflow question.

@Jack Aidley 2013-06-17 06:29:50

As pointed out by michaelb958, you cannot replace in place with data of a different length because this will put the rest of the sections out of place. I disagree with the other posters suggesting you read from one file and write to another. Instead, I would read the file into memory, fix the data up, and then write it out to the same file in a separate step.

# Read in the file
with open('file.txt', 'r') as file :
  filedata = file.read()

# Replace the target string
filedata = filedata.replace('ram', 'abcd')

# Write the file out again
with open('file.txt', 'w') as file:
  file.write(filedata)

This approach is fine unless you've got a massive file that is too big to load into memory in one go, or you are concerned about potential data loss if the process is interrupted during the second step, in which you write data back to the file.

@jfs 2013-12-15 10:58:51

with file = open(..): is not valid Python (=) though the intent is clear. .replace() doesn't modify the string (it is immutable) so you need to use the returned value. Anyway the code that supports big files can be even simpler unless you need to search and replace text that spans multiple lines.

@Jack Aidley 2013-12-15 16:32:46

You're quite right, and that - folks - is why you should test your code before embarrassing yourself on the internet ;)

@Jonas Stein 2016-04-16 13:31:15

The file should be closed in the end with file.close()

@Jack Aidley 2016-04-16 21:53:51

@JonasStein: No, it shouldn't. The with statement automatically closes the file at the end of the statement block.

@Jonas Stein 2016-04-17 10:41:44

@JackAidley that is interesting. Thank you for the explanation.

@Anekdotin 2016-11-15 17:56:53

@JackAidley elegant and easy solution

@diek 2017-07-31 15:23:16

I just used this, it worked perfectly on a large sql file.

@Jack Aidley 2018-03-12 10:27:41

I am curious as to why this answer, more than any other I have given, still receives regular upvotes long after I posted it.

@user3167654 2018-04-04 16:45:19

@JackAidley Do we need to worry about memory consumption using this method? especially if the file is very large

@Jack Aidley 2018-04-04 17:17:58

@user3167654 For a very large file, yes, you do. Also, if you have both a large file and need bomb-proof reliability, this is not the right method, since it can leave you without either the original file or the modified version if the write-back step is interrupted. However, for most uses I think it is appropriate.

@user3167654 2018-04-05 03:21:02

@JackAidley thank you

@Ben Barden 2018-09-18 15:47:29

@JackAidley because it is short, simple, easily used and understood, and addresses a real problem that a lot of people have (and therefore that a lot of people search for - thus finding your answer).

@Yuya Takashina 2019-02-08 02:24:59

You can also use pathlib.

from pathlib import Path  # standard library in Python 3; pathlib2 is the backport for older versions
path = Path(file_to_search)
text = path.read_text()
text = text.replace(text_to_search, replacement_text)
path.write_text(text)

@JAGJ jdfoxito 2019-05-07 17:29:15

F:\FOLDER>fart BIGFILE "\"" "" --remove

@Deepak G 2018-06-20 10:06:18

def findReplace(find, replace):
    import os

    # Get the folder in which the files are present (parent of the current directory)
    src = os.path.join(os.getcwd(), os.pardir)

    # Walk the tree and rewrite every .py file with the replacement applied
    for path, dirs, files in os.walk(os.path.abspath(src)):
        for name in files:
            if name.endswith('.py'):
                filepath = os.path.join(path, name)
                with open(filepath) as f:
                    s = f.read()
                s = s.replace(find, replace)
                with open(filepath, "w") as f:
                    f.write(s)

@Vinit Pillai 2018-01-23 18:45:54

def word_replace(filename,old,new):
    c=0
    with open(filename,'r+',encoding ='utf-8') as f:
        a=f.read()
        b=a.split()
        for i in range(0,len(b)):
            if b[i]==old:
                c=c+1
        old=old.center(len(old)+2)
        new=new.center(len(new)+2)
        d=a.replace(old,new,c)
        f.truncate(0)
        f.seek(0)
        f.write(d)
    print('All words have been replaced!!!')

@Vinit Pillai 2018-01-23 18:47:37

This code will replace the word you intend. The only problem is that it rewrites the whole file, so it might get stuck if the file is too large for the machine to handle.

@Doc5506 2017-09-24 16:57:20

I modified Jayram Singh's post slightly in order to replace every instance of a '!' character to a number which I wanted to increment with each instance. Thought it might be helpful to someone who wanted to modify a character that occurred more than once per line and wanted to iterate. Hope that helps someone. PS- I'm very new at coding so apologies if my post is inappropriate in any way, but this worked for me.

f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
n = 1  

# if word=='!'replace w/ [n] & increment n; else append same word to     
# file2

for line in f1:
    for word in line:
        if word == '!':
            f2.write(word.replace('!', f'[{n}]'))
            n += 1
        else:
            f2.write(word)
f1.close()
f2.close()

@Zelmik 2017-02-16 21:26:18

I have done this:

#!/usr/bin/env python3

import fileinput
import os

Dir = input ("Source directory: ")
os.chdir(Dir)

Filelist = os.listdir()
print('File list: ',Filelist)

NomeFile = input ("Insert file name: ")

CarOr = input ("Text to search: ")

CarNew = input ("New text: ")

with fileinput.FileInput(NomeFile, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(CarOr, CarNew), end='')

file.close ()

@Sergio 2019-01-31 11:15:16

Sad, but fileinput does not work with inplace=True with utf-8.

@RJay khadka 2016-08-25 19:18:56

It's worth checking out this small program. Regular expressions are the way to go.

https://github.com/khranjan/pythonprogramming/tree/master/findandreplace

@Roman Marusyk 2016-08-25 19:44:01

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.

@Chris 2017-04-26 20:21:46

Realize this is the SO standard, but usually doesn't warrant a downvote and it seems especially pedantic in this case; there's 50 lines of code in the linked project with ideas I've found useful, and I don't think it better to duplicate the whole file here, hence my upvote.

@BartoszKP 2018-05-07 12:06:07

@Chris We should not duplicate the whole file here because it has "ideas". We should quote only the relevant part. Roman's request is not pedantic at all.

@Chris 2018-05-14 14:50:47

@BartoszKP I agree the answer could be improved, but I don't think contributors should be penalised with a down vote when an answer is useful but slightly flawed, because I'd rather they post a reference than nothing at all. A comment would suffice.

@Chris 2018-05-14 14:56:37

PS. To make this meta-conversation a little more useful, I think the useful lines are dictionary = {'source': 'replacement', ...}; robj = re.compile('|'.join(dictionary.keys())); result = robj.sub(lambda match: dictionary[match.group(0)], src_stream)!
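
The lines quoted in the comment above can be turned into a small runnable sketch. The function name replace_many is hypothetical, and two details are additions not present in the quoted lines: the keys are passed through re.escape so they match literally, and longer keys are tried first so that overlapping keys behave predictably.

```python
import re

def replace_many(text, mapping):
    """Hypothetical helper: apply every replacement in `mapping` in one pass."""
    if not mapping:
        return text
    # Escape keys so they match literally; try longer keys first so that
    # 'abcd' wins over 'ab' when both are keys.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    return pattern.sub(lambda match: mapping[match.group(0)], text)

result = replace_many('abcd and ab', {'ab': 'X', 'abcd': 'ram'})
```

Because the text is scanned once, a replacement value is never itself replaced again, which differs from chaining several str.replace calls.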

@BartoszKP 2018-05-14 21:57:28

@Chris You are of course free to vote as you see fit :) Note however that the answer will be completely useless once the link is dead. So we should not discuss votes here, but whether the answer should be deleted or not.

@Neamerjell 2014-04-05 05:19:15

As Jack Aidley had posted and J.F. Sebastian pointed out, this code will not work:

# Read in the file
filedata = None
with file = open('file.txt', 'r') :
  filedata = file.read()

# Replace the target string
filedata.replace('ram', 'abcd')

# Write the file out again
with file = open('file.txt', 'w') :
  file.write(filedata)

But this code WILL work (I've tested it):

f = open(filein,'r')
filedata = f.read()
f.close()

newdata = filedata.replace("old data","new data")

f = open(fileout,'w')
f.write(newdata)
f.close()

Using this method, filein and fileout can be the same file, because Python 3.3 will overwrite the file upon opening for write.

@Diegomanas 2014-10-16 13:17:37

I believe the difference is here: filedata.replace('ram', 'abcd') Compared to: newdata = filedata.replace("old data","new data") Nothing to do with the "with" statement

@jfs 2015-01-30 20:05:01

1. Why would you remove the with-statement? 2. As stated in my answer, fileinput can work inplace -- it can replace data in the same file (it uses a temporary file internally). The difference is that fileinput does not require loading the whole file into memory.

@Chris 2017-04-26 20:08:42

Just to save others revisiting Jack Aidley's answer, it has been corrected since this answer, so this one is now redundant (and inferior due to losing the neater with blocks).

@LiPi 2013-12-15 10:19:22

My variant, one word at a time on the entire file.

I read it into memory.

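import os
import sys

def replace_word(infile, old_word, new_word):
    if not os.path.isfile(infile):
        print("Error on replace_word, not a regular file: " + infile)
        sys.exit(1)

    # Read the whole file into memory and close it before reopening for writing
    with open(infile, 'r') as f1:
        data = f1.read()
    # Write the replaced text back to the same file
    with open(infile, 'w') as f2:
        f2.write(data.replace(old_word, new_word))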

@icktoofay 2013-06-17 05:43:36

Your problem stems from reading from and writing to the same file. Rather than opening fileToSearch for writing, open an actual temporary file and then after you're done and have closed tempFile, use os.rename to move the new file over fileToSearch.
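
A minimal sketch of this temporary-file approach, under a few assumptions: the helper name replace_via_temp_file and the utf-8 default are made up for illustration, and os.replace is used for the final move instead of os.rename because os.rename fails on Windows when the destination already exists.

```python
import os
import tempfile

def replace_via_temp_file(path, old, new, encoding='utf-8'):
    """Hypothetical helper: stream `path` through a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final move
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding=encoding) as tmp, \
             open(path, 'r', encoding=encoding) as src:
            for line in src:
                tmp.write(line.replace(old, new))
        # Move the finished file over the original in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)   # do not leave a half-written temp file behind
        raise
```

Because it works line by line, this handles files larger than memory but cannot match text that spans a line break. The new file also gets the temporary file's permissions rather than the original's, which may or may not matter for a given use.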

@michaelb958 2013-06-17 05:53:05

Friendly FYI (feel free to edit into the answer): The root cause is not being able to shorten the middle of a file in place. That is, if you search for 5 characters and replace with 3, the first 3 chars of the 5 searched for will be replaced; but the other 2 can't be removed, they'll just stay there. The temporary file solution removes these "leftover" characters by dropping them instead of writing them out to the temporary file.
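
The leftover-characters effect described above is easy to reproduce on a throwaway file. The file name and contents below are made up for the demonstration.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('works abcd')

# Overwrite from the start with a shorter string, without truncating.
with open(demo_path, 'r+') as f:
    f.write('works ram')

with open(demo_path) as f:
    leftover = f.read()   # 'works ramd': the final 'd' of 'abcd' survives
```

The file keeps its original length of ten characters, so the one character that the shorter text did not overwrite remains at the end.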

@Jayram 2013-06-17 05:32:29

You can do the replacement like this

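# Read file1.txt line by line and write the replaced lines to file2.txt;
# the with block closes both files automatically
with open('file1.txt', 'r') as f1, open('file2.txt', 'w') as f2:
    for line in f1:
        f2.write(line.replace('old_text', 'new_text'))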

@dave 2016-01-01 08:46:10

This works beautifully.
