Changeset 58314 in webkit


Ignore:
Timestamp:
Apr 27, 2010 10:50:03 AM (14 years ago)
Author:
eric@webkit.org
Message:

2010-04-27 Eric Seidel <eric@webkit.org>

Reviewed by Adam Barth.

[chromium] new-run-webkit-tests hangs on Chromium Bots (OS X and Linux)
https://bugs.webkit.org/show_bug.cgi?id=37987

After further research, I believe the hang is caused by:
http://bugs.python.org/issue2320
Basically Popen() is not reentrant.
The workaround is to pass close_fds=True to Popen() on Mac/Linux.

I fixed our main Popen wrapper "Executive.run_command" to use close_fds=True
when appropriate.

I audited all places we call Popen() and either moved them to run_command
or left a FIXME that they are not thread safe. A few places I added the
close_fds workaround there and left an explanitory note.

  • Scripts/webkitpy/common/checkout/scm_unittest.py:
    • Added note that this Popen use is not threadsafe.
  • Scripts/webkitpy/common/system/executive.py:
    • Fixed our Executive.run_* to workaround python bug 2320.
  • Scripts/webkitpy/common/system/user.py: _ Added note that this Popen use is not threadsafe.
  • Scripts/webkitpy/layout_tests/layout_package/json_results_generator.py: ditto.
  • Scripts/webkitpy/layout_tests/port/apache_http_server.py: ditto.
  • Scripts/webkitpy/layout_tests/port/base.py:
    • Change wdiff back to using run_command now that we believe it to be threadsafe.
  • Scripts/webkitpy/layout_tests/port/chromium.py:
    • Fix to use Executive in places.
    • Pass self._executive down to the Driver for easier unit testing.
  • Scripts/webkitpy/layout_tests/port/chromium_win.py:
    • Re-factor to use a _kill_all method.
    • Made the _kill_all method use run_command to be threadsafe.
  • Scripts/webkitpy/layout_tests/port/http_server.py:
    • Add FIXME about using Executive.
  • Scripts/webkitpy/layout_tests/port/server_process.py:
    • Use Executive to be threadsafe.
  • Scripts/webkitpy/layout_tests/port/webkit.py:
    • Pass self._executive down to the Driver.
  • Scripts/webkitpy/layout_tests/port/websocket_server.py:
    • Add note about Popen not being threadsafe.
  • Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:
    • Move one caller to run_command add notes about moving others.
Location:
trunk/WebKitTools
Files:
14 edited

Legend:

Unmodified
Added
Removed
  • trunk/WebKitTools/ChangeLog

    r58297 r58314  
     12010-04-27  Eric Seidel  <eric@webkit.org>
     2
     3        Reviewed by Adam Barth.
     4
     5        [chromium] new-run-webkit-tests hangs on Chromium Bots (OS X and Linux)
     6        https://bugs.webkit.org/show_bug.cgi?id=37987
     7
     8        After further research, I believe the hang is caused by:
     9        http://bugs.python.org/issue2320
     10        Basically Popen() is not reentrant.
     11        The workaround is to pass close_fds=True to Popen() on Mac/Linux.
     12
     13        I fixed our main Popen wrapper "Executive.run_command" to use close_fds=True
     14        when appropriate.
     15
     16        I audited all places we call Popen() and either moved them to run_command
     17        or left a FIXME that they are not thread safe.  A few places I added the
     18        close_fds workaround there and left an explanitory note.
     19
     20        * Scripts/webkitpy/common/checkout/scm_unittest.py:
     21         - Added note that this Popen use is not threadsafe.
     22        * Scripts/webkitpy/common/system/executive.py:
     23         - Fixed our Executive.run_* to workaround python bug 2320.
     24        * Scripts/webkitpy/common/system/user.py:
     25         _ Added note that this Popen use is not threadsafe.
     26        * Scripts/webkitpy/layout_tests/layout_package/json_results_generator.py: ditto.
     27        * Scripts/webkitpy/layout_tests/port/apache_http_server.py: ditto.
     28        * Scripts/webkitpy/layout_tests/port/base.py:
     29         - Change wdiff back to using run_command now that we believe it
     30           to be threadsafe.
     31        * Scripts/webkitpy/layout_tests/port/chromium.py:
     32         - Fix to use Executive in places.
     33         - Pass self._executive down to the Driver for easier unit testing.
     34        * Scripts/webkitpy/layout_tests/port/chromium_win.py:
     35         - Re-factor to use a _kill_all method.
     36         - Made the _kill_all method use run_command to be threadsafe.
     37        * Scripts/webkitpy/layout_tests/port/http_server.py:
     38         - Add FIXME about using Executive.
     39        * Scripts/webkitpy/layout_tests/port/server_process.py:
     40         - Use Executive to be threadsafe.
     41        * Scripts/webkitpy/layout_tests/port/webkit.py:
     42         - Pass self._executive down to the Driver.
     43        * Scripts/webkitpy/layout_tests/port/websocket_server.py:
     44         - Add note about Popen not being threadsafe.
     45        * Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py:
     46         - Move one caller to run_command add notes about moving others.
     47
    1482010-04-27  Adam Barth  <abarth@webkit.org>
    249
  • trunk/WebKitTools/Scripts/webkitpy/common/checkout/scm_unittest.py

    r58261 r58314  
    5555# Callers could use run_and_throw_if_fail(args, cwd=cwd, quiet=True)
    5656def run_silent(args, cwd=None):
     57    # Note: Not thread safe: http://bugs.python.org/issue2320
    5758    process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd)
    5859    process.communicate() # ignore output
  • trunk/WebKitTools/Scripts/webkitpy/common/system/executive.py

    r58036 r58314  
    8888class Executive(object):
    8989
     90    def _should_close_fds(self):
     91        # We need to pass close_fds=True to work around Python bug #2320
     92        # (otherwise we can hang when we kill DumpRenderTree when we are running
     93        # multiple threads). See http://bugs.python.org/issue2320 .
     94        # Note that close_fds isn't supported on Windows, but this bug only
     95        # shows up on Mac and Linux.
     96        return sys.platform not in ('win32', 'cygwin')
     97
    9098    def _run_command_with_teed_output(self, args, teed_output):
    9199        args = map(unicode, args)  # Popen will throw an exception if args are non-strings (like int())
    92100        child_process = subprocess.Popen(args,
    93101                                         stdout=subprocess.PIPE,
    94                                          stderr=subprocess.STDOUT)
     102                                         stderr=subprocess.STDOUT,
     103                                         close_fds=self._should_close_fds())
    95104
    96105        # Use our own custom wait loop because Popen ignores a tee'd
     
    178187    def _compute_stdin(self, input):
    179188        """Returns (stdin, string_to_communicate)"""
     189        # FIXME: We should be returning /dev/null for stdin
     190        # or closing stdin after process creation to prevent
     191        # child processes from getting input from the user.
    180192        if not input:
    181193            return (None, None)
     
    201213                    return_stderr=True,
    202214                    decode_output=True):
     215        """Popen wrapper for convenience and to work around python bugs."""
    203216        args = map(unicode, args)  # Popen will throw an exception if args are non-strings (like int())
    204217        stdin, string_to_communicate = self._compute_stdin(input)
     
    209222                                   stdout=subprocess.PIPE,
    210223                                   stderr=stderr,
    211                                    cwd=cwd)
     224                                   cwd=cwd,
     225                                   close_fds=self._should_close_fds())
    212226        output = process.communicate(string_to_communicate)[0]
    213227        # run_command automatically decodes to unicode() unless explicitly told not to.
  • trunk/WebKitTools/Scripts/webkitpy/common/system/user.py

    r56975 r58314  
    6868        pager = os.environ.get("PAGER") or "less"
    6969        try:
     70            # Note: Not thread safe: http://bugs.python.org/issue2320
    7071            child_process = subprocess.Popen([pager], stdin=subprocess.PIPE)
    7172            child_process.communicate(input=message)
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/layout_package/json_results_generator.py

    r58036 r58314  
    126126            results_file.close()
    127127
     128    # FIXME: Callers should use scm.py instead.
    128129    def _get_svn_revision(self, in_directory):
    129130        """Returns the svn revision for the given directory.
     
    133134        """
    134135        if os.path.exists(os.path.join(in_directory, '.svn')):
     136            # Note: Not thread safe: http://bugs.python.org/issue2320
    135137            output = subprocess.Popen(["svn", "info", "--xml"],
    136138                                      cwd=in_directory,
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/apache_http_server.py

    r58036 r58314  
    192192        # the sake of Window/Cygwin and it needs quoting that breaks
    193193        # shell=False.
     194        # FIXME: We should not need to be joining shell arguments into strings.
     195        # shell=True is a trail of tears.
     196        # Note: Not thread safe: http://bugs.python.org/issue2320
    194197        self._httpd_proc = subprocess.Popen(self._start_cmd,
    195198                                            stderr=subprocess.PIPE,
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/base.py

    r58279 r58314  
    5151
    5252
    53 # Python bug workaround.  See Port.wdiff_text() for an explanation.
     53# Python's Popen has a bug that causes any pipes opened to a
     54# process that can't be executed to be leaked.  Since this
     55# code is specifically designed to tolerate exec failures
     56# to gracefully handle cases where wdiff is not installed,
     57# the bug results in a massive file descriptor leak. As a
     58# workaround, if an exec failure is ever experienced for
     59# wdiff, assume it's not available.  This will leak one
     60# file descriptor but that's better than leaking each time
     61# wdiff would be run.
     62#
     63# http://mail.python.org/pipermail/python-list/
     64#    2008-August/505753.html
     65# http://bugs.python.org/issue3210
    5466_wdiff_available = True
    5567_pretty_patch_available = True
     
    536548               actual_filename,
    537549               expected_filename]
    538         # FIXME: Why not just check os.exists(executable) once?
    539         global _wdiff_available
     550        global _wdiff_available  # See explaination at top of file.
    540551        result = ''
    541552        try:
    542             # Python's Popen has a bug that causes any pipes opened to a
    543             # process that can't be executed to be leaked.  Since this
    544             # code is specifically designed to tolerate exec failures
    545             # to gracefully handle cases where wdiff is not installed,
    546             # the bug results in a massive file descriptor leak. As a
    547             # workaround, if an exec failure is ever experienced for
    548             # wdiff, assume it's not available.  This will leak one
    549             # file descriptor but that's better than leaking each time
    550             # wdiff would be run.
    551             #
    552             # http://mail.python.org/pipermail/python-list/
    553             #    2008-August/505753.html
    554             # http://bugs.python.org/issue3210
    555             #
    556             # It also has a threading bug, so we don't output wdiff if
    557             # the Popen raises a ValueError.
    558             # http://bugs.python.org/issue1236
    559553            if _wdiff_available:
    560                 try:
    561                     # FIXME: Use Executive() here.
    562                     wdiff = subprocess.Popen(cmd,
    563                         stdout=subprocess.PIPE).communicate()[0]
    564                 except ValueError, e:
    565                     # Working around a race in Python 2.4's implementation
    566                     # of Popen().
    567                     wdiff = ''
     554                wdiff = self._executive.run_command(cmd, decode_output=False)
    568555                wdiff = cgi.escape(wdiff)
    569556                wdiff = wdiff.replace('##WDIFF_DEL##', '<span class=del>')
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/chromium.py

    r58279 r58314  
    4545import http_server
    4646
     47from webkitpy.common.system.executive import Executive
     48
    4749# FIXME: To use the DRT-based version of this file, we need to be able to
    4850# run the webkit code, which uses server_process, which requires UNIX-style
     
    8183    """Abstract base class for Chromium implementations of the Port class."""
    8284
    83     def __init__(self, port_name=None, options=None):
    84         base.Port.__init__(self, port_name, options)
     85    def __init__(self, port_name=None, options=None, **kwargs):
     86        base.Port.__init__(self, port_name, options, **kwargs)
    8587        self._chromium_base_dir = None
    8688
     
    119121
    120122    def check_sys_deps(self, needs_http):
    121         dump_render_tree_binary_path = self._path_to_driver()
    122         proc = subprocess.Popen([dump_render_tree_binary_path,
    123                                 '--check-layout-test-sys-deps'])
    124         if proc.wait():
     123        cmd = [self._path_to_driver(), '--check-layout-test-sys-deps']
     124        if self._executive.run_command(cmd, return_exit_code=True):
    125125            _log.error('System dependencies check failed.')
    126126            _log.error('To override, invoke with --nocheck-sys-deps')
     
    177177            webbrowser.open(uri, new=1)
    178178        else:
     179            # Note: Not thread safe: http://bugs.python.org/issue2320
    179180            subprocess.Popen([self._path_to_driver(), uri])
    180181
     
    182183        """Starts a new Driver and returns a handle to it."""
    183184        if self._options.use_drt:
    184             return webkit.WebKitDriver(self, image_path, options)
    185         return ChromiumDriver(self, image_path, options)
     185            return webkit.WebKitDriver(self, image_path, options, exectuive=self._executive)
     186        return ChromiumDriver(self, image_path, options, exectuive=self._executive)
    186187
    187188    def start_helper(self):
     
    189190        if helper_path:
    190191            _log.debug("Starting layout helper %s" % helper_path)
     192            # Note: Not thread safe: http://bugs.python.org/issue2320
    191193            self._helper = subprocess.Popen([helper_path],
    192194                stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=None)
     
    274276    """Abstract interface for test_shell."""
    275277
    276     def __init__(self, port, image_path, options):
     278    def __init__(self, port, image_path, options, executive=Executive()):
    277279        self._port = port
    278280        self._configuration = port._options.configuration
     
    282284        self._options = options
    283285        self._image_path = image_path
     286        self._executive = executive
    284287
    285288    def start(self):
     
    403406                    _log.warning('stopping test driver timed out, '
    404407                                 'killing it')
    405                     # FIXME: This should use Executive.
    406                     null = open(os.devnull, "w")  # Does this need an encoding?
    407                     subprocess.Popen(["kill", "-9",
    408                                      str(self._proc.pid)], stderr=null)
    409                     null.close()
     408                    self._executive.kill_process(self._proc.pid)
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/chromium_win.py

    r58146 r58314  
    145145                                            'wdiff.exe')
    146146
     147    def _kill_all(self, process_name):
     148        cmd = ['taskkill.exe', '/f', '/im', process_name]
     149        self._executive.run_command(cmd)
     150
    147151    def _shut_down_http_server(self, server_pid):
    148152        """Shut down the lighttpd web server. Blocks until it's fully
     
    152156            server_pid: The process ID of the running server.
    153157        """
    154         subprocess.Popen(('taskkill.exe', '/f', '/im', 'LightTPD.exe'),
    155                         stdin=open(os.devnull, 'r'),
    156                         stdout=subprocess.PIPE,
    157                         stderr=subprocess.PIPE).wait()
    158         subprocess.Popen(('taskkill.exe', '/f', '/im', 'httpd.exe'),
    159                         stdin=open(os.devnull, 'r'),
    160                         stdout=subprocess.PIPE,
    161                         stderr=subprocess.PIPE).wait()
     158        # FIXME: Why are we ignoring server_pid and calling
     159        # _kill_all instead of Executive.kill_process(pid)?
     160        self._kill_all("LightTPD.exe")
     161        self._kill_all("httpd.exe")
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/http_server.py

    r58146 r58314  
    211211            setup_mount = self._port_obj.path_from_chromium_base('third_party',
    212212                'cygwin', 'setup_mount.bat')
     213            # FIXME: Should use Executive.run_command
    213214            subprocess.Popen(setup_mount).wait()
    214215
    215216        _log.debug('Starting http server')
     217        # FIXME: Should use Executive.run_command
    216218        self._process = subprocess.Popen(start_cmd, env=env)
    217219
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/server_process.py

    r58036 r58314  
    3939import time
    4040
     41from webkitpy.common.system.executive import Executive
     42
    4143_log = logging.getLogger("webkitpy.layout_tests.port.server_process")
    4244
     
    4951    as necessary to keep issuing commands."""
    5052
    51     def __init__(self, port_obj, name, cmd, env=None):
     53    def __init__(self, port_obj, name, cmd, env=None, executive=Executive()):
    5254        self._port = port_obj
    5355        self._name = name
     
    5557        self._env = env
    5658        self._reset()
     59        self._executive = executive
    5760
    5861    def _reset(self):
     
    6770            raise ValueError("%s already running" % self._name)
    6871        self._reset()
     72        # close_fds is a workaround for http://bugs.python.org/issue2320
    6973        close_fds = sys.platform not in ('win32', 'cygwin')
    7074        self._proc = subprocess.Popen(self._cmd, stdin=subprocess.PIPE,
     
    216220                _log.warning('stopping %s timed out, killing it' %
    217221                             self._name)
    218                 # FIXME: This should use Executive.
    219                 null = open(os.devnull, "w")
    220                 subprocess.Popen(["kill", "-9",
    221                                   str(self._proc.pid)], stderr=null)
    222                 null.close()
     222                self._executive.kill_process(self._proc.pid)
    223223                _log.warning('killed')
    224224        self._reset()
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/webkit.py

    r58279 r58314  
    4646import webbrowser
    4747
     48from webkitpy.common.system.executive import Executive
     49
    4850import webkitpy.common.system.ospath as ospath
    4951import webkitpy.layout_tests.port.base as base
     
    5658    """WebKit implementation of the Port class."""
    5759
    58     def __init__(self, port_name=None, options=None):
    59         base.Port.__init__(self, port_name, options)
     60    def __init__(self, port_name=None, options=None, **kwargs):
     61        base.Port.__init__(self, port_name, options, **kwargs)
    6062        self._cached_build_root = None
    6163        self._cached_apache_path = None
     
    196198
    197199    def create_driver(self, image_path, options):
    198         return WebKitDriver(self, image_path, options)
     200        return WebKitDriver(self, image_path, options, executive=self._executive)
    199201
    200202    def test_base_platform_names(self):
     
    348350    """WebKit implementation of the DumpRenderTree interface."""
    349351
    350     def __init__(self, port, image_path, driver_options):
     352    def __init__(self, port, image_path, driver_options, executive=Executive()):
    351353        self._port = port
    352354        # FIXME: driver_options is never used.
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/port/websocket_server.py

    r58147 r58314  
    208208        _log.debug('cmdline: %s' % ' '.join(start_cmd))
    209209        # FIXME: We should direct this call through Executive for testing.
     210        # Note: Not thread safe: http://bugs.python.org/issue2320
    210211        self._process = subprocess.Popen(start_cmd,
    211212                                         stdin=open(os.devnull, 'r'),
  • trunk/WebKitTools/Scripts/webkitpy/layout_tests/rebaseline_chromium_webkit_tests.py

    r58036 r58314  
    5959import zipfile
    6060
     61from webkitpy.common.system.executive import run_command
     62
    6163import port
    6264from layout_package import test_expectations
     
    9799
    98100    # Use a shell for subcommands on Windows to get a PATH search.
     101    # FIXME: shell=True is a trail of tears, and should be removed.
    99102    use_shell = sys.platform.startswith('win')
     103    # Note: Not thread safe: http://bugs.python.org/issue2320
    100104    p = subprocess.Popen(command, stdout=subprocess.PIPE,
    101105                         stderr=subprocess.STDOUT, shell=use_shell)
     
    282286        return self._rebaselining_tests
    283287
     288    # FIXME: Callers should use scm.py instead.
    284289    def _get_repo_type(self):
    285290        """Get the repository type that client is using."""
    286         output, return_code = run_shell_with_return_code(['svn', 'info'],
    287                                                          False)
     291        return_code = run_command(['svn', 'info'], return_exit_code=True)
    288292        if return_code == 0:
    289293            return REPO_SVN
     
    609613            _log.info('No test was rebaselined so nothing to remove.')
    610614
     615    # FIXME: Callers should move to SCM.add instead.
    611616    def _svn_add(self, filename):
    612617        """Add the file to SVN repository.
Note: See TracChangeset for help on using the changeset viewer.