By blue928


2013-04-09 04:55:46 8 Comments

I am having a lot of trouble with the inefficiency of node_save(). But is node_save() actually my problem? That's ultimately what I am trying to find out.

I created a loop with 100,000 iterations. I created the bare minimum for the node object to be valid and save correctly. Here is the node save code:

$node = new stdClass();
$node->type = "test_page";

// Fill in the default values for a new node.
node_object_prepare($node);

$node->uid = 1;
// $node_title is assumed to be set by the surrounding loop.
$node->title = $node_title;
$node->status = 1;
$node->language = LANGUAGE_NONE;
if ($node = node_submit($node)) {
  node_save($node);
}
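The loop itself is not shown above. As a rough sketch of how such a run could be timed, the harness below wraps any callable and reports elapsed wall-clock time. The name `benchmark_saves` is hypothetical, and the stand-in workload only appends to an array so that the example runs without Drupal; in the real benchmark the callable would wrap the node_save() snippet.

```php
<?php
// Hypothetical timing harness: runs $save_one $iterations times and reports
// the elapsed wall-clock time. Any callable works, so it runs without Drupal.
function benchmark_saves(callable $save_one, $iterations) {
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    $save_one("Test node $i");
  }
  $elapsed = microtime(TRUE) - $start;
  return array(
    'iterations' => $iterations,
    'seconds' => $elapsed,
    'per_second' => $elapsed > 0 ? $iterations / $elapsed : INF,
  );
}

// Stand-in workload (an assumption for illustration, not node_save()).
$titles = array();
$result = benchmark_saves(function ($title) use (&$titles) {
  $titles[] = $title;
}, 1000);

printf("%d iterations in %.4f s\n", $result['iterations'], $result['seconds']);
```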

Here are the results:

100,000 nodes were saved, each using node_save(). It took 5196.22 seconds to complete. That is ONLY 19 saves a second.
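For reference, here is the arithmetic behind that figure, next to the 30,000-items-per-minute target I mention further down, expressed in the same unit. The helper name `saves_per_second` is only illustrative.

```php
<?php
// Illustrative helper (not a Drupal function): average throughput of a run.
function saves_per_second($count, $seconds) {
  return $count / $seconds;
}

// Measured: 100,000 node_save() calls in 5196.22 seconds.
$measured = saves_per_second(100000, 5196.22);

// Target: 30,000 items per minute.
$target = saves_per_second(30000, 60);

printf("measured: %.2f saves/s\n", $measured);
printf("target:   %.2f saves/s\n", $target);
printf("speed-up needed: about %.0fx\n", $target / $measured);
```

That works out to roughly 19 saves per second measured against 500 per second required, a gap of about 26 times.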

To say the least, that is not acceptable, especially when one person reports around 1,200 individual insert queries per second and another reports 25,000 inserts per second.

So, what's going on here? Where is the bottleneck? Is it with the node_save() function and how it's designed?

Could it be my hardware? It is a development server with no one on it except me: Intel dual core, 3 GHz, Ubuntu 12.04, 16 GB of RAM.

While the loop runs my resource usage is: MySQL 27% CPU, 6M RAM; PHP 22% CPU 2M RAM.

My MySQL configuration was generated by the Percona wizard.

MySQL says that if my CPU usage is under 70%, my problem is disk bound. Granted, I only have a run-of-the-mill WD Caviar 7200 RPM drive, but I would hope to get more than 19 inserts a second out of it!

Not too long ago I wrote about saving 30,000 nodes in a day. However, to be clear, this test has nothing to do with any external forces. It's purely a benchmark to learn how to increase the speed of calls to node_save().

Realistically, I need to get 30,000 items into the database every minute using node_save(). If node_save() is not an option, I wonder if I can write my own Drupal API function, "node_batch_save()" or something similar, that takes advantage of MySQL's ability to do bulk inserts with the INSERT query. Thoughts on how to approach this?
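To make the bulk-insert idea concrete, here is a minimal sketch of what a single multi-row INSERT looks like when built by hand. The helper `build_bulk_insert` and the table `example_items` are hypothetical, and identifiers are assumed to be trusted since they are not escaped. Drupal 7's db_insert() can produce the same kind of statement through repeated values() calls. Note that a node is stored across several tables (node, node revision, and one per field), so one statement like this would not by itself create a valid node.

```php
<?php
// Illustrative helper (hypothetical name): builds one multi-row INSERT with
// positional placeholders, suitable for PDO::prepare() and execute($params).
// Table and column names are assumed to be trusted; they are not escaped.
function build_bulk_insert($table, array $columns, array $rows) {
  if (empty($rows)) {
    throw new InvalidArgumentException('At least one row is required.');
  }
  // One "(?, ?, ...)" group per row.
  $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
  $groups = array_fill(0, count($rows), $group);

  // Flatten the row values in column order to match the placeholders.
  $params = array();
  foreach ($rows as $row) {
    foreach ($columns as $column) {
      $params[] = $row[$column];
    }
  }

  $sql = sprintf(
    'INSERT INTO %s (%s) VALUES %s',
    $table,
    implode(', ', $columns),
    implode(', ', $groups)
  );
  return array($sql, $params);
}

// Hypothetical table and rows, not the real Drupal node schema.
list($sql, $params) = build_bulk_insert(
  'example_items',
  array('title', 'status'),
  array(
    array('title' => 'First', 'status' => 1),
    array('title' => 'Second', 'status' => 1),
  )
);
echo $sql, "\n";
```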

3 comments

@kenorb 2013-10-04 14:18:33

First of all, install XCache/APC (for PHP < 5.5) and configure memcached for Drupal.

Then you can optimize your MySQL configuration for heavy queries by using the mysqltuner script, available at: http://mysqltuner.pl

E.g.

# performance tweaks (adjusted based on mysqltuner.pl)
query_cache_size = 32M
query_cache_limit = 256M
join_buffer_size = 32M
key_buffer = 8M
max_allowed_packet = 32M
table_cache = 512
sort_buffer_size = 1M
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 1M
myisam_sort_buffer_size = 8M

# When making adjustments, make tmp_table_size/max_heap_table_size equal
tmp_table_size = 16M
max_heap_table_size = 16M

thread_cache_size = 4

Other suggestions:

  • disable modules which you don't need (e.g. Devel, the core Database Logging module, etc.),
  • upgrade PHP to the latest release or a newer branch,
  • recompile PHP for a 64-bit architecture if your CPU supports it,
  • use a faster storage device for your database files or the whole LAMP environment (e.g. an SSD or a memory-based filesystem),
  • use a PHP debugger or profiler to find performance bottlenecks (e.g. XDebug Profiler, DTrace or NuSphere PhpED PHP Profiler),
  • run a time-consuming drush command under the gprof profiling tool to find further bottlenecks.

@John McCollum 2014-02-04 16:03:29

Tuning MySQL seems to make a big difference. I went from about 80 node_saves a minute to about 700 just by following the tips given by mysqltuner.pl.

@giorgio79 2013-06-11 09:25:04

Use the MongoDB module to store fields: https://drupal.org/project/mongodb. Results are reported at http://cyrve.com/mongodb

@Bojan Zivanovic 2013-04-09 09:18:57

You will never get 30,000 inserts a minute using node_save(). No way.

An INSERT is fast because that's all it does. node_save() does multiple inserts (main table, revision table, a table for each field), clears any entity caches, and fires hooks. The hooks are the tricky part. If you have many contrib modules (or even one that misbehaves), that can really kill performance, especially if the author didn't account for the "I am saving a ton of nodes at once" use case. For instance, I had to add this to my Migrate class:

  public function processImport(array $options = array()) {
    // Pass the caller's options through unchanged.
    parent::processImport($options);
    // Do not force menu rebuilding. Otherwise pathauto will try to rebuild
    // in each node_save() invocation.
    variable_set('menu_rebuild_needed', FALSE);
  }

On the other hand, if you write a custom save function that invokes no hooks, you are in clear danger of getting inconsistent data, in a state that is unexpected by the system. I would never recommend doing that. Fire up xhprof and see what's happening.
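As a starting point for the profiling step, here is a minimal sketch that assumes the xhprof PHP extension is installed. `slowest_calls` and `work_to_profile` are hypothetical names; the stand-in workload would be replaced by the node_save() loop inside a bootstrapped Drupal site. The sketch ranks the call edges in the profile by inclusive wall time, which is usually enough to see which hook implementations dominate.

```php
<?php
// Hypothetical helper: returns the $limit most expensive call edges from an
// xhprof profile, ranked by inclusive wall time ('wt', in microseconds).
function slowest_calls(array $profile, $limit = 10) {
  uasort($profile, function ($a, $b) {
    return $b['wt'] - $a['wt'];
  });
  return array_slice($profile, 0, $limit, TRUE);
}

// Stand-in for the code being measured (an assumption for illustration).
function work_to_profile() {
  return array_sum(range(1, 1000));
}

if (extension_loaded('xhprof')) {
  xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY);
  work_to_profile();
  $profile = xhprof_disable();
  foreach (slowest_calls($profile, 5) as $edge => $stats) {
    printf("%-50s calls=%d wall=%dus\n", $edge, $stats['ct'], $stats['wt']);
  }
}
else {
  echo "xhprof extension not loaded; nothing profiled.\n";
}
```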

@blue928 2013-04-09 09:30:47

How do the migration modules out there end up bulk saving nodes? I mean, at the end of it all, it all boils down to an INSERT statement, right? How does your migration class ultimately insert from 'source' to 'target' when not using node_save() but still needing to maintain data integrity across tables?

@Alfred Armstrong 2013-04-09 09:35:51

All migration modules I have come across do use node_save().

@Clive 2013-04-09 09:35:58

@blue928 He's saying he does use node_save(), but adds some code to mitigate known problems it can cause, like Pathauto rebuilding the menu cache after every node save.

@blue928 2013-04-09 09:54:40

Ah, OK, I see. Bojan, is your code available in a module or online where I could see how you have dealt with bottlenecks like Pathauto? Good idea with xhprof. I'll check into that.
