|
|
Created:
11 years, 11 months ago by wallyworld Modified:
11 years, 10 months ago Visibility:
Public. |
DescriptionInitial simple streams support
This branch adds a simple streams package to allow image metadata to be retrieved
from json encoded metadata. The process starts by reading an index file from a known
URL. The URL will be a static URL for EC2 (http://cloud-images.ubuntu.com/releases),
and for openstack can be published in the keystone catalog or provided from an env
config. This latter work to plug in the URLs from the providers will be done separately.
This branch is just the core capability.
The data model used is a single implementation for all providers (so far EC2 and Openstack).
Various providers use many different metadata attributes but these are ignored - only the common
ones like id, region, arch, vtype etc are required to be read and used for filtering, and only
the id is ultimately required by the calling business logic.
The tests are written to use example data to check the various processing scenarios, but
also can be run live against canonistack or EC2. The live tests are a subset of the local
tests and check that basic parsing and image lookup works with real data from two providers
with significant differences in the exact data attributes used. The canonistack metadata has
been published in a known public bucket.
https://code.launchpad.net/~wallyworld/juju-core/oxygen-data-parse/+merge/162308
(do not edit description out of merge proposal)
Patch Set 1 #
Total comments: 56
Patch Set 2 : Initial simple streams support #
Total comments: 7
Patch Set 3 : Initial simple streams support #Patch Set 4 : Initial simple streams support #Patch Set 5 : Initial simple streams support #
Total comments: 28
Patch Set 6 : Initial simple streams support #
MessagesTotal messages: 16
Please take a look.
Sign in to reply to this message.
Nice work deciphering a fairly abstruse format. Generally pretty good, but could use some work. This does not lgtm until the below issues are resolved. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... File environs/simplestreams/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:1: // Support for locating, parsing, and filtering Ubuntu image metadata in simplestreams format. // The simplestream package supports ... https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:3: d (then it becomes a doc comment) https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:4: package simplestreams This is not a great name. For a start, despite the launchpad project name, what we're dealing with here are neither simple nor streams. Also, this package is specialised for image metadata. A "simplestreams" package would be much more general than that. You've called the branch "oxygen-data-parse". Might "oxygen" be a better name for the package? (I'm afraid I'm not familiar with all the terminology). https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:37: // Generates a string representing a product id in a known format. The format is known to you perhaps, but definitely worth mentioning. // String returns a product id in the following format: with perhaps a reference to where the convention comes from. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:46: panic(fmt.Errorf("Invalid Ubuntu release %q", ps.Release)) is this really worth it? it means we need to maintain the list of release versions above, and it adds one more possible panic, when it doesn't really seem like a panic is necessarily justified. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:53: type ImageMetadata struct { i find these type names quite confusing. "ImageMetadata" vs "CloudImagedata" for example - there's no indication in the name of the very different roles of these types. I think I'd find it easier to read if the types read hierarchically from the top down rather than the other way around (i.e. start with CloudImageMetadata and end with ImageMetadata). It would also help if each type had a comment describing the role of the type and what it holds. That goes doubly for the Aliases field with its rather complex semantics. Despite the name "simple streams", the data format here is really not very simple at all. Given that there appears to be no other resource that documents it decently, I think we should do so here to save other people going through the same reverse-engineering process. Also, I think it might be better to unexport all but the types that are returned by GetImageIdMetadata. Then the package documentation will be much clearer, I think. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:84: Aliases map[string]AliasesByAttribute `json:"_aliases"` please document the semantics of this field. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:118: func GetDefaultImageIdMetadata(baseURL string, cloudSpec *CloudSpec, prodSpec *ProductSpec) ([]*ImageMetadata, error) { I'm not sure this function justifies itself in convenience over just having GetImageIdMetadata. How about just exporting DefaultIndexPath? https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:124: func GetImageIdMetadata(baseURL, indexPath string, cloudSpec *CloudSpec, prodSpec *ProductSpec) ([]*ImageMetadata, error) { Having gone to all the trouble of downloading the index data, it seems a pity to throw it away. How about something like this? func FetchIndex(baseURL, indexPath string) (*Index, error) type Index struct { // Unexported fields } func (idx *Index) GetImages(cloudSpec CloudSpec, prodSpec ProdSpec) ([]*ImageMetadata, error) (Note the fact that I changed *CloudSpec to CloudSpec above. I think that's reasonable given that there's no nil default) BTW is there a decent reason to separate the CloudSpec from ProductSpec (apart from the fact that they're mentioned in different parts of the metadata)? ISTM that the API might be slightly tidier if we had a single exported Spec type. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:132: // Helper function to fetch data from a given http location. // fetchData fetches all the data from the // given path relative to the given base URL. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:145: return nil, fmt.Errorf("invalid URL %s, %s", dataURL, resp.Status) this error is not accurate. the URL might well be valid, but there might be something up with a gateway or something else. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:158: return nil, fmt.Errorf("cannot read index file, %s", err.Error()) fmt.Errorf("cannot read index data: %v", err) (talking about a file makes it sounds like we're reading something from the local file system) https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:201: tmp := m why not just: return m.ProductsFilePath, nil and lose the metadata variable? https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:218: func extractAttrMap(metadataStruct interface{}) map[string]string { this seems a bit unnecessarily heavy. see below for a possible alternative. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:236: func (metadata *CloudImageMetadata) denormaliseImageMetadata() { I found this code really quite hard to understand; moreover there is a subtle bug here. At first I wondered why we didn't just do something ultra simple like: func imageInherit(img *ImageMetadata, coll *ImageCollection, cat *ImageMetadataCatalog) { img.RegionName = firstNonEmpty(img.RegionName, coll.RegionName, cat.RegionName) img.Endpoint = firstNonEmpty(img.Endpoint, coll.Endpoint, cat.Endpoint) } but then I saw the alias semantics and I understood a little bit more. I'd be happier if the reflect/json tag code wasn't spread around the place so much. I played around with it a bit trying to come up with something nicer in that respect. How about something along these kinds of lines? http://paste.ubuntu.com/5629110/ https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:249: imageCollection.Images[id] = im this logic would be simpler if you stored the various structs as pointers not values. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:286: applyAttributeValues(im, attrAliases, true) is that last "true" correct? should aliases override attributes defined specifically in the image metadata itself? https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:310: if !containsImage(matchingImages, &im) { I'm not sure this n^2 algorithm is justified. Wouldn't it be almost as easy to build up a map[key] *ImageMetadata ? where type key struct { vtype string storage string } https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:326: // Load the entire cloud image metadata encoded using the specified format. // getCloudMetadataWithFormat loads the entire cloud image // metadata that must be encoded with the specified format. I think the denormalisation comment is just an implementation detail here. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:335: return nil, fmt.Errorf("cannot read product file, %s", err.Error()) fmt.Errorf("cannot read product data: %v", err) https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:338: if len(data) > 0 { why bother with this check? The "cannot unmarshal" error would seem a more appropriate error message than "expected index file format" https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:345: return nil, fmt.Errorf("expected index file format %s, got %s", format, imageMetadata.Format) fmt.Errorf("unexpected index format %q; expected %q", imageMetadata.Format, format) ? https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:352: // Load the image metadata for the given cloud and order the images starting with the most recent. // getLatestImageIdMetadataWithFormat loads the ... https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:373: sort.Sort(bv) I'm not sure it's worth sorting the whole thing here. Can't we just replace elements in a map if the versions are greater? (see the n^2 algorithm comment above) https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... File environs/simplestreams/simplestreams_test.go (right): https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:354: c.Assert(len(ic.Images) > 0, Equals, true) im := ic.Images["usww2hw"] c.Check etc that way the test will fail even if there is no key "usww2he". https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:381: ic := metadataCatalog.Images["20121111"] this test is not testing what you think it is. there is no 20121111 image in the data. see comment above for a less fragile way to phrase the test. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:396: if key != "usww3he" { ditto
Sign in to reply to this message.
https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... File environs/simplestreams/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:4: package simplestreams On 2013/05/03 14:58:08, rog wrote: > This is not a great name. For a start, despite the launchpad project name, what > we're dealing with here are neither simple nor streams. I'd argue they are streams. The data describes 'product' streams, ie "sources of like things that get refreshed". > You've called the branch "oxygen-data-parse". Might "oxygen" be a better name > for the package? (I'm afraid I'm not familiar with all the terminology). I'd avoid using the word oxygen. that was a codename that encompassed much more than just reading image-ids. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:4: package simplestreams On 2013/05/03 14:58:08, rog wrote: > This is not a great name. For a start, despite the launchpad project name, what > we're dealing with here are neither simple nor streams. > > Also, this package is specialised for image metadata. A "simplestreams" package > would be much more general than that. > > You've called the branch "oxygen-data-parse". Might "oxygen" be a better name > for the package? (I'm afraid I'm not familiar with all the terminology). > Please dont use the word oxygen, as that was a codename for something larger than this. Perhaps 'product-streams' or 'product-image-streams' ? https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:46: panic(fmt.Errorf("Invalid Ubuntu release %q", ps.Release)) On 2013/05/03 14:58:08, rog wrote: > is this really worth it? it means we need to maintain the list of release > versions above, and it adds one more possible panic, when it doesn't really seem > like a panic is necessarily justified. This is fallout of the data format. you dont have to be able to generate the "release-name" -> "product name" map. You *could* just go digging through for it. However, the index file lists the products that are contained in it. so if you know that you can avoid reading lots un-related items and looking at their tags for 'release: precise', 'arch: amd64'.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... File environs/simplestreams/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:1: // Support for locating, parsing, and filtering Ubuntu image metadata in simplestreams format. On 2013/05/03 14:58:08, rog wrote: > // The simplestream package supports ... Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:3: On 2013/05/03 14:58:08, rog wrote: > d (then it becomes a doc comment) Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:4: package simplestreams On 2013/05/03 14:58:08, rog wrote: > This is not a great name. For a start, despite the launchpad project name, what > we're dealing with here are neither simple nor streams. > > Also, this package is specialised for image metadata. A "simplestreams" package > would be much more general than that. > > You've called the branch "oxygen-data-parse". Might "oxygen" be a better name > for the package? (I'm afraid I'm not familiar with all the terminology). > I've renamed the package "imagemetadata" https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:37: // Generates a string representing a product id in a known format. On 2013/05/03 14:58:08, rog wrote: > The format is known to you perhaps, but definitely worth mentioning. > > // String returns a product id in the following format: > > with perhaps a reference to where the convention comes from. The main doc for simplestreams is in the doc/README file for the project which I've now mentioned in the package doc string. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:46: panic(fmt.Errorf("Invalid Ubuntu release %q", ps.Release)) On 2013/05/03 14:58:08, rog wrote: > is this really worth it? it means we need to maintain the list of release > versions above, and it adds one more possible panic, when it doesn't really seem > like a panic is necessarily justified. There's no easy way to get the mapping from release->version_num and there's only ever going to need to be a new version added every 6 months. So the releaseVersions map is essentially static for all intents and purposes. I think the approach I've used is reasonable in this case. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:53: type ImageMetadata struct { On 2013/05/03 14:58:08, rog wrote: > i find these type names quite confusing. > "ImageMetadata" vs "CloudImagedata" for example - there's no indication in the > name of the very different roles of these types. > > I think I'd find it easier to read if the types read hierarchically from the top > down rather than the other way around (i.e. start with CloudImageMetadata and > end with ImageMetadata). > > It would also help if each type had a comment describing the role of the type > and what it holds. That goes doubly for the Aliases field with its rather > complex semantics. > > Despite the name "simple streams", the data format here is really not very > simple at all. Given that there appears to be no other resource that documents > it decently, I think we should do so here to save other people going through the > same reverse-engineering process. > > Also, I think it might be better to unexport all but the types that are returned > by GetImageIdMetadata. Then the package documentation will be much clearer, I > think. I've changed the order of the structs and unexported the structs. I really don't want to include too much documentation here though since it's all documented in the doc/README file in the simplestreams project and I've mentioned this in the package doc string. Trying to maintain doc in two places would not be fun. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:84: Aliases map[string]AliasesByAttribute `json:"_aliases"` On 2013/05/03 14:58:08, rog wrote: > please document the semantics of this field. See previous comments - the lp:simplestreams project contains the doc. I also added a couple of extra lines of doc up where the aliasesByAttribute struct was defined https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:118: func GetDefaultImageIdMetadata(baseURL string, cloudSpec *CloudSpec, prodSpec *ProductSpec) ([]*ImageMetadata, error) { On 2013/05/03 14:58:08, rog wrote: > I'm not sure this function justifies itself in convenience over just having > GetImageIdMetadata. > > How about just exporting DefaultIndexPath? Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:124: func GetImageIdMetadata(baseURL, indexPath string, cloudSpec *CloudSpec, prodSpec *ProductSpec) ([]*ImageMetadata, error) { On 2013/05/03 14:58:08, rog wrote: > Having gone to all the trouble of downloading the index data, it seems a pity to > throw it away. > Sorry, I don't understand this comment. The index data is not discarded. getIndexWithFormat fetches it into an indexRef object and then it is used when indexRef.getLatestImageIdMetadataWithFormat is called. Beyond that, it is not required. > > BTW is there a decent reason to separate the CloudSpec from ProductSpec (apart > from the fact that they're mentioned in different parts of the metadata)? ISTM > that the API might be slightly tidier if we had a single exported Spec type. Semantically they are separate concepts. I guess I could do an ImageSpec struct but I'm still leaning towards wanting to keep them separate since that better reflects how I think users would think about this ie "for the cloud defined in this CloudSpec, find me image metadata matching the info in this ProductSpec". https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:132: // Helper function to fetch data from a given http location. On 2013/05/03 14:58:08, rog wrote: > // fetchData fetches all the data from the > // given path relative to the given base URL. Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:145: return nil, fmt.Errorf("invalid URL %s, %s", dataURL, resp.Status) On 2013/05/03 14:58:08, rog wrote: > this error is not accurate. the URL might well be valid, but there might be > something up with a gateway or something else. Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:158: return nil, fmt.Errorf("cannot read index file, %s", err.Error()) On 2013/05/03 14:58:08, rog wrote: > fmt.Errorf("cannot read index data: %v", err) > > (talking about a file makes it sounds like we're reading something from the > local file system) Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:201: tmp := m On 2013/05/03 14:58:08, rog wrote: > why not just: > return m.ProductsFilePath, nil > > and lose the metadata variable? Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:236: func (metadata *CloudImageMetadata) denormaliseImageMetadata() { On 2013/05/03 14:58:08, rog wrote: > I found this code really quite hard to understand; moreover there is a subtle > bug here. > > At first I wondered why we didn't just do something ultra simple like: > > func imageInherit(img *ImageMetadata, coll *ImageCollection, cat > *ImageMetadataCatalog) { > img.RegionName = firstNonEmpty(img.RegionName, coll.RegionName, cat.RegionName) > img.Endpoint = firstNonEmpty(img.Endpoint, coll.Endpoint, cat.Endpoint) > } > > but then I saw the alias semantics and I understood a little bit more. > > I'd be happier if the reflect/json tag code wasn't spread around the place so > much. > > I played around with it a bit trying to come up with something nicer in that > respect. How about something along these kinds of lines? > > http://paste.ubuntu.com/5629110/ > I looked at the suggested solution and it adds about 100 extra lines of code which I don't see the value in doing. How about I just add a bit more code comments to what's there - it's really not so bad once the reasoning behind the simplestreams stuff is understood I don't think https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:249: imageCollection.Images[id] = im On 2013/05/03 14:58:08, rog wrote: > this logic would be simpler if you stored the various structs as pointers not > values. Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:286: applyAttributeValues(im, attrAliases, true) On 2013/05/03 14:58:08, rog wrote: > is that last "true" correct? should aliases override attributes defined > specifically in the image metadata itself? I believe so based on my understanding of the format https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:310: if !containsImage(matchingImages, &im) { On 2013/05/03 14:58:08, rog wrote: > I'm not sure this n^2 algorithm is justified. > Wouldn't it be almost as easy to build up a map[key] *ImageMetadata ? > where > type key struct { > vtype string > storage string > } Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:326: // Load the entire cloud image metadata encoded using the specified format. On 2013/05/03 14:58:08, rog wrote: > // getCloudMetadataWithFormat loads the entire cloud image > // metadata that must be encoded with the specified format. > > I think the denormalisation comment is just an implementation detail here. Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:335: return nil, fmt.Errorf("cannot read product file, %s", err.Error()) On 2013/05/03 14:58:08, rog wrote: > fmt.Errorf("cannot read product data: %v", err) Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:338: if len(data) > 0 { On 2013/05/03 14:58:08, rog wrote: > why bother with this check? The "cannot unmarshal" error would seem a more > appropriate error message than "expected index file format" Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:345: return nil, fmt.Errorf("expected index file format %s, got %s", format, imageMetadata.Format) On 2013/05/03 14:58:08, rog wrote: > fmt.Errorf("unexpected index format %q; expected %q", imageMetadata.Format, > format) > ? Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:352: // Load the image metadata for the given cloud and order the images starting with the most recent. On 2013/05/03 14:58:08, rog wrote: > // getLatestImageIdMetadataWithFormat loads the ... Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams.go:373: sort.Sort(bv) On 2013/05/03 14:58:08, rog wrote: > I'm not sure it's worth sorting the whole thing here. Can't we just replace > elements in a map if the versions are greater? (see the n^2 algorithm comment > above) The version is not part of the image metadata struct, although I could maybe add it. There really aren't that many versions though so it's really not worth the extra code complexity. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... File environs/simplestreams/simplestreams_test.go (right): https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:354: c.Assert(len(ic.Images) > 0, Equals, true) On 2013/05/03 14:58:08, rog wrote: > im := ic.Images["usww2hw"] > c.Check etc > > that way the test will fail even if there is no key "usww2he". Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:381: ic := metadataCatalog.Images["20121111"] On 2013/05/03 14:58:08, rog wrote: > this test is not testing what you think it is. there is no 20121111 image in the > data. > > see comment above for a less fragile way to phrase the test. Done. https://codereview.appspot.com/9138044/diff/1/environs/simplestreams/simplest... environs/simplestreams/simplestreams_test.go:396: if key != "usww3he" { On 2013/05/03 14:58:08, rog wrote: > ditto Done.
Sign in to reply to this message.
A couple of comments, I've not gone over everything. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:23: var releaseVersions = map[string]string{ Rather than adding another list of release names we'll need to keep updated, I'd be tempted to depend on the distro-info-data package then use /usr/share/distro-info/ubuntu.csv to populate this. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:164: return nil, fmt.Errorf("cannot unmarshal JSON index metadata: %s", err.Error()) Including the URL included in this error would make debugging easier. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:167: return nil, fmt.Errorf("unexpected index file format %q, expected %s", indices.Format, format) Likewise, including the URL here would be useful. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... File environs/imagemetadata/simplestreams_test.go (right): https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams_test.go:258: c.Assert(err, NotNil) Also checking the error messages are sane here and contain useful details, rather than just NotNil, would be nice.
Sign in to reply to this message.
Please take a look. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:23: var releaseVersions = map[string]string{ On 2013/05/07 17:33:54, gz wrote: > Rather than adding another list of release names we'll need to keep updated, I'd > be tempted to depend on the distro-info-data package then use > /usr/share/distro-info/ubuntu.csv to populate this. Done. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:164: return nil, fmt.Errorf("cannot unmarshal JSON index metadata: %s", err.Error()) On 2013/05/07 17:33:54, gz wrote: > Including the URL included in this error would make debugging easier. Done. https://codereview.appspot.com/9138044/diff/7001/environs/imagemetadata/simpl... environs/imagemetadata/simplestreams.go:167: return nil, fmt.Errorf("unexpected index file format %q, expected %s", indices.Format, format) On 2013/05/07 17:33:54, gz wrote: > Likewise, including the URL here would be useful. Done.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
Please take a look.
Sign in to reply to this message.
LGTM with the changes
Sign in to reply to this message.
LGTM with the below issues addressed. thanks! https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:36: func NewImageConstraint(region, endpoint, release, arch, stream string) ImageConstraint { i'm not entirely convinced of the worth of this function - if i saw a call to it, i'd have to look up which argument corresponded to which attribute - a struct literal would be clearer. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:74: var releaseVersions = map[string]string{ since this is a mutable global, this should be guarded with a sync.Mutex. How about, rather than caching the image in the image constraint, you read the distro info once (I think it's not unreasonable to assume that /usr/share/distro-info doesn't change after the program has started) and just look up the info via the releaseVersions map. var ( releaseVersionsMutex sync.Mutex updatedReleaseVersions bool releaseVersions map[string]string ) func (id *ImageConstraint) Id() (s string, err error) { vers, err := releaseVersion(ic.Release) etc } func releaseVersion(release string) (string, error) releaseVersionsMutex.Lock() defer releaseVersionsMutex.Unlock() if vers, ok := releaseVersions[release]; ok { return vers, nil } if updatedReleaseVersions { return "", errNotFound } err := updateDistroInfo() updatedReleaseVersions = true if err != nil { return "", err } if vers, ok := releaseVersions[release]; ok { return vers, nil } return "", errNotFound } https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:102: // the numeric version may contain a LTS moniker so strip that out. if len(parts) < 3 { return fmt.Errorf("syntax error in ...") } (or just ignore lines with fewer than the right number of parts). we don't want to panic because the file isn't correctly formatted. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:145: // ImageMetadata records matching the supplied region, arch etc. this comment seems out of place. how about just: // ImageMetadata holds information about a particular cloud image. ? https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:184: // The index file location is as specified. s/specified./specified. The usual file location is DefaultIndexPath./ https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:185: func GetImageIdMetadata(baseURL, indexPath string, imageConstraint *ImageConstraint) ([]*ImageMetadata, error) { this name seems a little redundant now that the name of the packages is "imagemetadata" - imagemetadata.GetImageIdMetadata. how about just "Fetch" as a name? imagemetadata.Fetch(url, path, constraint) reads quite nicely to me. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:193: // fetchData gets all the data from the given path relative to the given base URL. // It returns the data found and the URL used. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:224: return nil, fmt.Errorf("cannot unmarshal JSON index metadata at URL %s: %s", url, err.Error()) return nil, fmt.Errorf("cannot unmarshal JSON index metadata at URL %q: %v", url, err) (no need to explicitly call Error, and the url should be quoted) https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:227: return nil, fmt.Errorf("unexpected index file format %q, expected %s at URL %s", indices.Format, format, url) s/%s/%q/g
Sign in to reply to this message.
Thanks for the review. I've implemented the suggestions but have a question about the NewImageConstraint() method remark. See my inline comment. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:36: func NewImageConstraint(region, endpoint, release, arch, stream string) ImageConstraint { On 2013/05/15 17:06:26, rog wrote: > i'm not entirely convinced of the worth of this function - if i saw a call to > it, i'd have to look up which argument corresponded to which attribute - a > struct literal would be clearer. The reason for the function is because the returned ImageConstraint struct contains an embedded CloudSpec struct and I didn't want to expose this implementation detail. If I removed this function, then the user will need to type: ImageConstraint{ CloudSpec: CloudSpec{ Endpoint: endpoint, Region: region, }, Release: release, Arch: arch, Stream: stream, } The above seems kinda verbose and as I said, exposes the internals of ImageConstraint. Are you ok with those issues if I remove the function? https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:74: var releaseVersions = map[string]string{ On 2013/05/15 17:06:26, rog wrote: > since this is a mutable global, this should be guarded with a sync.Mutex. > > How about, rather than caching the image in the image constraint, you read the > distro info once (I think it's not unreasonable to assume that > /usr/share/distro-info doesn't change after the program has started) and just > look up the info via the releaseVersions map. > > var ( > releaseVersionsMutex sync.Mutex > updatedReleaseVersions bool > releaseVersions map[string]string > ) > > func (id *ImageConstraint) Id() (s string, err error) { > vers, err := releaseVersion(ic.Release) > etc > } > > func releaseVersion(release string) (string, error) > releaseVersionsMutex.Lock() > defer releaseVersionsMutex.Unlock() > if vers, ok := releaseVersions[release]; ok { > return vers, nil > } > if updatedReleaseVersions { > return "", errNotFound > } > err := updateDistroInfo() > updatedReleaseVersions = true > if err != nil { > return "", err > } > if vers, ok := releaseVersions[release]; ok { > return vers, nil > } > return "", errNotFound > } Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:102: // the numeric version may contain a LTS moniker so strip that out. On 2013/05/15 17:06:26, rog wrote: > if len(parts) < 3 { > return fmt.Errorf("syntax error in ...") > } > (or just ignore lines with fewer than the right number of parts). > > we don't want to panic because the file isn't correctly formatted. Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:145: // ImageMetadata records matching the supplied region, arch etc. On 2013/05/15 17:06:26, rog wrote: > this comment seems out of place. > how about just: > > // ImageMetadata holds information about a particular cloud image. > ? Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:184: // The index file location is as specified. On 2013/05/15 17:06:26, rog wrote: > s/specified./specified. The usual file location is DefaultIndexPath./ Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:193: // fetchData gets all the data from the given path relative to the given base URL. On 2013/05/15 17:06:26, rog wrote: > // It returns the data found and the URL used. Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:224: return nil, fmt.Errorf("cannot unmarshal JSON index metadata at URL %s: %s", url, err.Error()) On 2013/05/15 17:06:26, rog wrote: > return nil, fmt.Errorf("cannot unmarshal JSON index metadata at URL %q: %v", > url, err) > > (no need to explicitly call Error, and the url should be quoted) Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:227: return nil, fmt.Errorf("unexpected index file format %q, expected %s at URL %s", indices.Format, format, url) On 2013/05/15 17:06:26, rog wrote: > s/%s/%q/g Done.
Sign in to reply to this message.
LGTM https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:87: // On non-Ubuntu systems this file won't exist butr that's expected. typo "butr" Should this only trap for NoExist? Given we don't really want to fail, I'm probably fine suppressing all errors, though. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:91: bufRdr := bufio.NewReader(f) Channeling my inner thumper, the name could be improved. :) https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams_test.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams_test.go:60: var indexData = []jujutest.FileContent{ It seems odd that the two bits of content are indented differently. I would have expected either both to be left justified, or both indented. (I realize the actual JSON unmarshalling just ignores the extra whitespace.) Is there a reason to indent one but not the other?
Sign in to reply to this message.
sorry, but not LGTM without further discussion https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:28: Release string Can we call this "Series" please, for consistency with the rest of the codebase? https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:29: Arch string This shouldn't be here, I think. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:43: Arch: arch, We cannot assume that arch is known at this stage. We need to get all arches (or possibly only those matching an []string (ie the arches we have tools available for)). https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:67: return "", fmt.Errorf("Invalid Ubuntu release %q", ic.Release) "invalid series %q"? https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:257: if pid == prodSpecId { I'm not sure this is correct, because of (1) arch and (2) our inability to determine for sure the release number corresponding to the desired name. Surely we should be getting every products file that is not *definitely* inappropriate, and filtering within those by arch and release (series)?
Sign in to reply to this message.
On 16 May 2013 01:24, <ian.booth@canonical.com> wrote: > Thanks for the review. I've implemented the suggestions but have a > question about the NewImageConstraint() method remark. See my inline > comment. > > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > File environs/imagemetadata/simplestreams.go (right): > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:36: func > NewImageConstraint(region, endpoint, release, arch, stream string) > ImageConstraint { > On 2013/05/15 17:06:26, rog wrote: >> >> i'm not entirely convinced of the worth of this function - if i saw a > > call to >> >> it, i'd have to look up which argument corresponded to which attribute > > - a >> >> struct literal would be clearer. > > > The reason for the function is because the returned ImageConstraint > struct contains an embedded CloudSpec struct and I didn't want to expose > this implementation detail. If I removed this function, then the user > will need to type: > > ImageConstraint{ > CloudSpec: CloudSpec{ > Endpoint: endpoint, > Region: region, > }, > Release: release, > Arch: arch, > Stream: stream, > } > > The above seems kinda verbose and as I said, exposes the internals of > ImageConstraint. Are you ok with those issues if I remove the function? Yes. It's all exported, so none of it counts as internals. It's a pity about the CloudSpec: Cloudspec thing though. Another possibility would be to export a struct that had all the fields at one level, and translate into an internal form (or use that one-level form internally). It hardly matters though - this is not a function that's going to be called often, AFAICS. > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:74: var releaseVersions = > map[string]string{ > On 2013/05/15 17:06:26, rog wrote: >> >> since this is a mutable global, this should be guarded with a > > sync.Mutex. > >> How about, rather than caching the image in the image constraint, you > > read the >> >> distro info once (I think it's not unreasonable to assume that >> /usr/share/distro-info doesn't change after the program has started) > > and just >> >> look up the info via the releaseVersions map. > > >> var ( >> releaseVersionsMutex sync.Mutex >> updatedReleaseVersions bool >> releaseVersions map[string]string >> ) > > >> func (id *ImageConstraint) Id() (s string, err error) { >> vers, err := releaseVersion(ic.Release) >> etc >> } > > >> func releaseVersion(release string) (string, error) >> releaseVersionsMutex.Lock() >> defer releaseVersionsMutex.Unlock() >> if vers, ok := releaseVersions[release]; ok { >> return vers, nil >> } >> if updatedReleaseVersions { >> return "", errNotFound >> } >> err := updateDistroInfo() >> updatedReleaseVersions = true >> if err != nil { >> return "", err >> } >> if vers, ok := releaseVersions[release]; ok { >> return vers, nil >> } >> return "", errNotFound >> } > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:102: // the numeric version may > contain a LTS moniker so strip that out. > On 2013/05/15 17:06:26, rog wrote: >> >> if len(parts) < 3 { >> return fmt.Errorf("syntax error in ...") >> } >> (or just ignore lines with fewer than the right number of parts). > > >> we don't want to panic because the file isn't correctly formatted. > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:145: // ImageMetadata records > matching the supplied region, arch etc. > On 2013/05/15 17:06:26, rog wrote: >> >> this comment seems out of place. >> how about just: > > >> // ImageMetadata holds information about a particular cloud image. >> ? > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:184: // The index file location > is as specified. > On 2013/05/15 17:06:26, rog wrote: >> >> s/specified./specified. The usual file location is DefaultIndexPath./ > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:193: // fetchData gets all the > data from the given path relative to the given base URL. > On 2013/05/15 17:06:26, rog wrote: >> >> // It returns the data found and the URL used. > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:224: return nil, > fmt.Errorf("cannot unmarshal JSON index metadata at URL %s: %s", url, > err.Error()) > On 2013/05/15 17:06:26, rog wrote: >> >> return nil, fmt.Errorf("cannot unmarshal JSON index metadata at URL > > %q: %v", >> >> url, err) > > >> (no need to explicitly call Error, and the url should be quoted) > > > Done. > > > https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... > environs/imagemetadata/simplestreams.go:227: return nil, > fmt.Errorf("unexpected index file format %q, expected %s at URL %s", > indices.Format, format, url) > On 2013/05/15 17:06:26, rog wrote: >> >> s/%s/%q/g > > > Done. > > https://codereview.appspot.com/9138044/
Sign in to reply to this message.
As discussed with William on a hangout, this code is just for the parsing of the simplestreams data. The multiple arch stuff is plugged in downstream. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... File environs/imagemetadata/simplestreams.go (right): https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:28: Release string On 2013/05/16 07:54:54, fwereade wrote: > Can we call this "Series" please, for consistency with the rest of the codebase? Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:67: return "", fmt.Errorf("Invalid Ubuntu release %q", ic.Release) On 2013/05/16 07:54:54, fwereade wrote: > "invalid series %q"? Done. https://codereview.appspot.com/9138044/diff/18001/environs/imagemetadata/simp... environs/imagemetadata/simplestreams.go:87: // On non-Ubuntu systems this file won't exist butr that's expected. On 2013/05/16 06:18:52, jameinel wrote: > typo "butr" > > Should this only trap for NoExist? Given we don't really want to fail, I'm > probably fine suppressing all errors, though. Yeah, the consensus seemed to be to not fail
Sign in to reply to this message.
*** Submitted: Initial simple streams support This branch adds a simple streams package to allow image metadata to be retrieved from json encoded metadata. The process starts by reading an index file from a known URL. The URL will be a static URL for EC2 (http://cloud-images.ubuntu.com/releases), and for openstack can be published in the keystone catalog or provided from an env config. This latter work to plug in the URLs from the providers will be done separately. This branch is just the core capability. The data model used is a single implementation for all providers (so far EC2 and Openstack). Various providers use many different metadata attributes but these are ignored - only the common ones like id, region, arch, vtype etc are required to be read and used for filtering, and only the id is ultimately required by the calling business logic. The tests are written to use example data to check the various processing scenarios, but also can be run live against canonistack or EC2. The live tests are a subset of the local tests and check that basic parsing and image lookup works with real data from two providers with significant differences in the exact data attributes used. The canonistack metadata has been published in a known public bucket. R=rog, smoser, gz, jameinel, fwereade CC= https://codereview.appspot.com/9138044
Sign in to reply to this message.
|