From: "Zac Medico" <zmedico@gentoo.org>
To: gentoo-commits@lists.gentoo.org
Subject: [gentoo-commits] proj/portage:master commit in: pym/portage/dbapi/
Date: Sun, 15 Oct 2017 00:59:07 +0000 (UTC)
Message-ID: <1508028820.c5a2a0edc4f4b01b16a274268431fa21f7f678b2.zmedico@gentoo>

commit:     c5a2a0edc4f4b01b16a274268431fa21f7f678b2
Author:     Daniel Robbins <drobbins <AT> funtoo <DOT> org>
AuthorDate: Sat Oct 14 23:38:05 2017 +0000
Commit:     Zac Medico <zmedico <AT> gentoo <DOT> org>
CommitDate: Sun Oct 15 00:53:40 2017 +0000
URL:        https://gitweb.gentoo.org/proj/portage.git/commit/?id=c5a2a0ed

portdbapi: factor out _better_cache class

Better_cache -- now even better :) This version only scans individual
categories on-demand. I have addressed concerns about PMS-compliance by
enhancing the documentation so that developers are aware of what
assumptions to make (and not make) when using better_cache.

Closes: https://github.com/gentoo/portage/pull/219

 pym/portage/dbapi/porttree.py | 124 ++++++++++++++++++++++++------------------
 1 file changed, 71 insertions(+), 53 deletions(-)

diff --git a/pym/portage/dbapi/porttree.py b/pym/portage/dbapi/porttree.py
index 53edcd18f..f5979d2d0 100644
--- a/pym/portage/dbapi/porttree.py
+++ b/pym/portage/dbapi/porttree.py
@@ -16,7 +16,7 @@ portage.proxy.lazyimport.lazyimport(globals(),
 	'portage.package.ebuild.doebuild:doebuild',
 	'portage.util:ensure_dirs,shlex_split,writemsg,writemsg_level',
 	'portage.util.listdir:listdir',
-	'portage.versions:best,catpkgsplit,_pkgsplit@pkgsplit,ver_regexp,_pkg_str',
+	'portage.versions:best,catsplit,catpkgsplit,_pkgsplit@pkgsplit,ver_regexp,_pkg_str',
 )
 
 from portage.cache import volatile
@@ -103,6 +103,68 @@ class _dummy_list(list):
 		except ValueError:
 			pass
 
+
+class _better_cache(object):
+
+	"""
+	The purpose of better_cache is to locate catpkgs in repositories using ``os.listdir()`` as much as possible, which
+	is less expensive IO-wise than exhaustively doing a stat on each repo for a particular catpkg. better_cache stores a
+	list of repos in which particular catpkgs appear. Various dbapi methods use better_cache to locate repositories of
+	interest related to particular catpkg rather than performing an exhaustive scan of all repos/overlays.
+
+	Better_cache.items data may look like this::
+
+	  { "sys-apps/portage" : [ repo1, repo2 ] }
+
+	Without better_cache, Portage will get slower and slower (due to excessive IO) as more overlays are added.
+
+	Also note that it is OK if this cache has some 'false positive' catpkgs in it. We use it to search for specific
+	catpkgs listed in ebuilds. The likelihood of a false positive catpkg in our cache causing a problem is extremely
+	low, because the user of our cache is passing us a catpkg that came from somewhere and has already undergone some
+	validation, and even then will further interrogate the short-list of repos we return to gather more information
+	on the catpkg.
+
+	Thus, the code below is optimized for speed rather than painstaking correctness. I have added a note to
+	``dbapi.getRepositories()`` to ensure that developers are aware of this just in case.
+
+	The better_cache has been redesigned to perform on-demand scans -- it will only scan a category at a time, as
+	needed. This should further optimize IO performance by not scanning category directories that are not needed by
+	Portage.
+	"""
+
+	def __init__(self, repositories):
+		self._items = collections.defaultdict(list)
+		self._scanned_cats = set()
+
+		# ordered list of all porttree locations we'll scan:
+		self._repo_list = [repo for repo in reversed(list(repositories))
+			if repo.location is not None]
+
+	def __getitem__(self, catpkg):
+		result = self._items.get(catpkg)
+		if result is not None:
+			return result
+
+		cat, pkg = catsplit(catpkg)
+		if cat not in self._scanned_cats:
+			self._scan_cat(cat)
+		return self._items[catpkg]
+
+	def _scan_cat(self, cat):
+		for repo in self._repo_list:
+			cat_dir = repo.location + "/" + cat
+			try:
+				pkg_list = os.listdir(cat_dir)
+			except OSError as e:
+				if e.errno not in (errno.ENOTDIR, errno.ENOENT, errno.ESTALE):
+					raise
+				continue
+			for p in pkg_list:
+				if os.path.isdir(cat_dir + "/" + p):
+					self._items[cat + "/" + p].append(repo)
+		self._scanned_cats.add(cat)
+
+
 class portdbapi(dbapi):
 	"""this tree will scan a portage directory located at root (passed to init)"""
 	portdbapi_instances = _dummy_list()
@@ -346,11 +408,14 @@ class portdbapi(dbapi):
 			return None
 
 	def getRepositories(self, catpkg=None):
+
 		"""
 		With catpkg=None, this will return a complete list of repositories in this dbapi. With catpkg set to a value,
 		this method will return a short-list of repositories that contain this catpkg. Use this second approach if
 		possible, to avoid exhaustively searching all repos for a particular catpkg. It's faster for this method to
-		find the catpkg than for you do it yourself.
+		find the catpkg than for you to do it yourself. When specifying catpkg, you should have reasonable assurance that
+		the category is valid and PMS-compliant as the caching mechanism we use does not perform validation checks for
+		categories.
 
 		This function is required for GLEP 42 compliance.
 
@@ -358,7 +423,8 @@ class portdbapi(dbapi):
 		  catpkg; if None, return a list of all Repositories that contain a particular catpkg.
 		@return: a list of repositories.
 		"""
-		if catpkg is not None and self._better_cache is not None and catpkg in self._better_cache:
+
+		if catpkg is not None and self._better_cache is not None:
 			return [repo.name for repo in self._better_cache[catpkg]]
 		return self._ordered_repo_name_list
 
@@ -796,12 +862,7 @@ class portdbapi(dbapi):
 		elif self._better_cache is None:
 			mytrees = self.porttrees
 		else:
-			try:
-				repos = self._better_cache[mycp]
-			except KeyError:
-				mytrees = []
-			else:
-				mytrees = [repo.location for repo in repos]
+			mytrees = [repo.location for repo in self._better_cache[mycp]]
 		for oroot in mytrees:
 			try:
 				file_list = os.listdir(os.path.join(oroot, mycp))
@@ -850,50 +911,7 @@ class portdbapi(dbapi):
 			"minimum-all-ignore-profile", "minimum-visible"):
 			self.xcache[x]={}
 		self.frozen=1
-		self._better_cache = better_cache = collections.defaultdict(list)
-
-		# The purpose of self._better_cache is to perform an initial quick scan of all repositories
-		# using os.listdir(), which is less expensive IO-wise than exhaustively doing a stat on each
-		# repo. self._better_cache stores a list of repos in which particular catpkgs appear.
-		#
-		# For example, better_cache data may look like this:
-		#
-		# { "sys-apps/portage" : [ repo1, repo2 ] }
-		#
-		# Without this tweak, Portage will get slower and slower as more overlays are added.
-		#
-		# Also note that it is OK if this cache has some 'false positive' catpkgs in it. We use it
-		# to search for specific catpkgs listed in ebuilds. The likelihood of a false positive catpkg
-		# in our cache causing a problem is extremely low. Thus, the code below is optimized for
-		# speed rather than painstaking correctness.
-
-		valid_categories = self.settings.categories
-		for repo_loc in reversed(self.porttrees):
-			repo = self.repositories.get_repo_for_location(repo_loc)
-			try:
-				categories = os.listdir(repo_loc)
-			except OSError as e:
-				if e.errno not in (errno.ENOTDIR, errno.ENOENT, errno.ESTALE):
-					raise
-				continue
-
-			for cat in categories:
-				if cat not in valid_categories:
-					continue
-				cat_dir = repo_loc + "/" + cat
-				try:
-					pkg_list = os.listdir(cat_dir)
-				except OSError as e:
-					if e.errno != errno.ENOTDIR:
-						raise
-					continue
-
-				for p in pkg_list:
-					catpkg_dir = cat_dir + "/" + p
-					if not os.path.isdir(catpkg_dir):
-						continue
-					catpkg = cat + "/" + p
-					better_cache[catpkg].append(repo)
+		self._better_cache = _better_cache(self.repositories)
 
 	def melt(self):
 		self.xcache = {}
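
The on-demand lookup introduced by this patch can be tried outside Portage with a minimal standalone sketch. This is not the Portage code itself: the class name `OnDemandCatpkgCache`, the `demo()` helper, and the temporary directory layout are hypothetical, repositories are plain path strings rather than repo objects, and `str.partition` stands in for `portage.versions.catsplit`. It shows the two behaviours the callers in the diff rely on: only the requested category is listed, and an unknown catpkg yields an empty list rather than a `KeyError`.

```python
import collections
import errno
import os
import tempfile


class OnDemandCatpkgCache:
    """Map "category/package" to the repos containing it, one category at a time."""

    def __init__(self, repo_locations):
        self._items = collections.defaultdict(list)
        self._scanned_cats = set()
        # Mirror the patch: walk the repositories in reversed order.
        self._repo_list = list(reversed(list(repo_locations)))

    def __getitem__(self, catpkg):
        # Assumption: catpkg has the form "category/package".
        cat, _, _pkg = catpkg.partition("/")
        if cat not in self._scanned_cats:
            self._scan_cat(cat)
        # defaultdict lookup: an unknown catpkg returns [] instead of raising.
        return self._items[catpkg]

    def _scan_cat(self, cat):
        for repo in self._repo_list:
            cat_dir = os.path.join(repo, cat)
            try:
                pkg_list = os.listdir(cat_dir)
            except OSError as e:
                # A repo that lacks this category is simply skipped.
                if e.errno not in (errno.ENOTDIR, errno.ENOENT, errno.ESTALE):
                    raise
                continue
            for p in pkg_list:
                if os.path.isdir(os.path.join(cat_dir, p)):
                    self._items[cat + "/" + p].append(repo)
        self._scanned_cats.add(cat)


def demo():
    """Build two throwaway repos and query the cache."""
    with tempfile.TemporaryDirectory() as root:
        gentoo = os.path.join(root, "gentoo")
        overlay = os.path.join(root, "overlay")
        os.makedirs(os.path.join(gentoo, "sys-apps", "portage"))
        os.makedirs(os.path.join(gentoo, "dev-lang", "python"))
        os.makedirs(os.path.join(overlay, "sys-apps", "portage"))

        cache = OnDemandCatpkgCache([gentoo, overlay])
        found = [os.path.basename(r) for r in cache["sys-apps/portage"]]
        missing = cache["sys-apps/no-such-package"]
        scanned = sorted(cache._scanned_cats)
        return found, missing, scanned
```

In this sketch, `dev-lang` is never listed because nothing asked for it, which is the IO saving the commit message describes. Because a miss returns an empty list, callers no longer need the `try`/`except KeyError` or the `catpkg in self._better_cache` check that the diff removes.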


